====================================================================
Float
====================================================================

FriCAS provides two kinds of floating point numbers. The domain Float
implements a model of arbitrary precision floating point numbers. The
domain DoubleFloat is intended to make available hardware floating
point arithmetic in FriCAS. The actual model of floating point that
DoubleFloat provides is system-dependent. For example, on the IBM
system 370 FriCAS uses IBM double precision which has fourteen
hexadecimal digits of precision or roughly sixteen decimal digits.

Arbitrary precision floats allow the user to specify the precision at
which arithmetic operations are computed. Although this is an
attractive facility, it comes at a cost. Arbitrary-precision
floating-point arithmetic typically takes twenty to two hundred times
more time than hardware floating point.

---------------------
Introduction to Float
---------------------

Scientific notation is supported for input and output of floating
point numbers. A floating point number is written as a string of
digits containing a decimal point optionally followed by the letter
"E", and then the exponent.

We begin by doing some calculations using arbitrary precision floats.
The default precision is twenty decimal digits.
::

  1.234
   1.234
                              Type: Float

A decimal base for the exponent is assumed, so the number 1.234E2
denotes 1.234x10^2.
::

  1.234E2
   123.4
                              Type: Float

The normal arithmetic operations are available for floating point
numbers.
::

  sqrt(1.2 + 2.3 / 3.4 ** 4.5)
   1.0996972790 671286226
                              Type: Float

--------------------
Conversion Functions
--------------------

You can use conversion to go back and forth between Integer, Fraction
Integer and Float, as appropriate.
::

  i := 3 :: Float
   3.0
                              Type: Float

  i :: Integer
   3
                              Type: Integer

  i :: Fraction Integer
   3
                              Type: Fraction Integer

Since you are explicitly asking for a conversion, you must take
responsibility for any loss of exactness.
::

  r := 3/7 :: Float
   0.4285714285 7142857143
                              Type: Float

  r :: Fraction Integer
   3
   -
   7
                              Type: Fraction Integer

This conversion cannot be performed: use truncate or round if that is
what you intend.
::

  r :: Integer
   Cannot convert from type Float to Integer for value
   0.4285714285 7142857143

The operations truncate and round truncate toward zero and round to
the nearest integral Float, respectively.
::

  truncate 3.6
   3.0
                              Type: Float

  round 3.6
   4.0
                              Type: Float

  truncate(-3.6)
   - 3.0
                              Type: Float

  round(-3.6)
   - 4.0
                              Type: Float

The operation fractionPart computes the fractional part of x, that
is, x - truncate x.
::

  fractionPart 3.6
   0.6
                              Type: Float

The operation digits allows the user to set the precision. It returns
the previous value it was using.
::

  digits 40
   20
                              Type: PositiveInteger

  sqrt 0.2
   0.4472135954 9995793928 1834733746 2552470881
                              Type: Float

  pi()$Float
   3.1415926535 8979323846 2643383279 502884197
                              Type: Float

The precision is only limited by the computer memory available.
Calculations at 500 or more digits of precision are not difficult.
::

  digits 500
   40
                              Type: PositiveInteger

  pi()$Float
   3.1415926535 8979323846 2643383279 5028841971 6939937510 5820974944 592307816
   4 0628620899 8628034825 3421170679 8214808651 3282306647 0938446095 505822317
   2 5359408128 4811174502 8410270193 8521105559 6446229489 5493038196 442881097
   5 6659334461 2847564823 3786783165 2712019091 4564856692 3460348610 454326648
   2 1339360726 0249141273 7245870066 0631558817 4881520920 9628292540 917153643
   6 7892590360 0113305305 4882046652 1384146951 9415116094 3305727036 575959195
   3 0921861173 8193261179 3105118548 0744623799 6274956735 1885752724 891227938
   1 830119491
                              Type: Float

Reset digits to its default value.
::

  digits 20
   500
                              Type: PositiveInteger

Numbers of type Float are represented as a record of two integers,
namely, the mantissa and the exponent where the base of the exponent
is binary. That is, the floating point number (m,e) represents the
number m x 2^e.
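
The (m,e) pair can be inspected from the interpreter. The following is
a minimal sketch, assuming the operations mantissa, exponent, base and
float that Float inherits from the category FloatingPointSystem. The
variable names x, m and e are chosen here for illustration, and the
outputs are omitted because the values of m and e depend on the
current precision.
::

  x := 0.1
  m := mantissa x          -- the integer m of the pair (m,e)
  e := exponent x          -- the integer e of the pair (m,e)
  base()$Float             -- base of the exponent, binary for Float
  float(m, e)$Float - x    -- rebuilds m x 2^e; should differ from x by zero
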
A consequence of using a binary base is that decimal numbers cannot,
in general, be represented exactly.

----------------
Output Functions
----------------

A number of operations exist for specifying how numbers of type Float
are to be displayed. By default, spaces are inserted every ten digits
in the output for readability. Note that you cannot include spaces in
the input form of a floating point number, though you can use
underscores.

Output spacing can be modified with the outputSpacing operation. This
inserts no spaces and then displays the value of x.
::

  outputSpacing 0; x := sqrt 0.2
   0.44721359549995793928
                              Type: Float

Issue this to have the spaces inserted every 5 digits.
::

  outputSpacing 5; x
   0.44721 35954 99957 93928
                              Type: Float

By default, the system displays floats in either fixed format or
scientific format, depending on the magnitude of the number.
::

  y := x/10**10
   0.44721 35954 99957 93928 E -10
                              Type: Float

A particular format may be requested with the operations
outputFloating and outputFixed.
::

  outputFloating(); x
   0.44721 35954 99957 93928 E 0
                              Type: Float

  outputFixed(); y
   0.00000 00000 44721 35954 99957 93928
                              Type: Float

Additionally, you can ask for n digits to be displayed after the
decimal point.
::

  outputFloating 2; y
   0.45 E -10
                              Type: Float

  outputFixed 2; x
   0.45
                              Type: Float

This resets the output printing to the default behavior.
::

  outputGeneral()
                              Type: Void

----------------------------------------
Example: Determinant of a Hilbert Matrix
----------------------------------------

Consider the problem of computing the determinant of a 10 by 10
Hilbert matrix. With i and j counted from 0, the (i,j)-th entry of a
Hilbert matrix is given by 1/(i+j+1).

First do the computation using rational numbers to obtain the exact
result.
::

  a: Matrix Fraction Integer:=matrix[ [1/(i+j+1) for j in 0..9] for i in 0..9]
   +     1   1   1   1   1   1   1   1   1+
   | 1   -   -   -   -   -   -   -   -  --|
   |     2   3   4   5   6   7   8   9  10|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   | -   -   -   -   -   -   -   -  --  --|
   | 2   3   4   5   6   7   8   9  10  11|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   | -   -   -   -   -   -   -  --  --  --|
   | 3   4   5   6   7   8   9  10  11  12|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   | -   -   -   -   -   -  --  --  --  --|
   | 4   5   6   7   8   9  10  11  12  13|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   | -   -   -   -   -  --  --  --  --  --|
   | 5   6   7   8   9  10  11  12  13  14|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   | -   -   -   -  --  --  --  --  --  --|
   | 6   7   8   9  10  11  12  13  14  15|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   | -   -   -  --  --  --  --  --  --  --|
   | 7   8   9  10  11  12  13  14  15  16|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   | -   -  --  --  --  --  --  --  --  --|
   | 8   9  10  11  12  13  14  15  16  17|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   | -  --  --  --  --  --  --  --  --  --|
   | 9  10  11  12  13  14  15  16  17  18|
   |                                      |
   | 1   1   1   1   1   1   1   1   1   1|
   |--  --  --  --  --  --  --  --  --  --|
   +10  11  12  13  14  15  16  17  18  19+
                              Type: Matrix Fraction Integer

This version of determinant uses Gaussian elimination.
::

  d:= determinant a
                             1
   -----------------------------------------------------
   46206893947914691316295628839036278726983680000000000
                              Type: Fraction Integer

  d :: Float
   0.21641 79226 43149 18691 E -52
                              Type: Float

Now use hardware floats. Note that a semicolon (;) is used to prevent
the display of the matrix.
::

  b: Matrix DoubleFloat:=matrix[ [1/(i+j+1$DoubleFloat) for j in 0..9] for i in 0..9];
                              Type: Matrix DoubleFloat

The result given by hardware floats is correct only to four
significant digits of precision. In the jargon of numerical analysis,
the Hilbert matrix is said to be *ill-conditioned*.
::

  determinant b
   2.1643677945721411E-53
                              Type: DoubleFloat

Now repeat the computation at a higher precision using Float.
::

  digits 40
   20
                              Type: PositiveInteger

  c: Matrix Float := matrix [ [1/(i+j+1$Float) for j in 0..9] for i in 0..9];
                              Type: Matrix Float

  determinant c
   0.21641 79226 43149 18690 60594 98362 26174 36159 E -52
                              Type: Float

Reset digits to its default value.
::

  digits 20
   40
                              Type: PositiveInteger

See Also:

* ``)help DoubleFloat``
* ``)show Float``