Floatingpoint #97

Merged
merged 19 commits into from
Oct 3, 2024
Changes from 13 commits
9 changes: 8 additions & 1 deletion doc/README.md
@@ -32,9 +32,14 @@ Some in-development items will have opened issues, as well. Feel free to create
- Sort
- [Bitonic sort](./components/sort.md#bitonic-sort)
- Arithmetic
- [Prefix Trees](./components/parallel_prefix_operations.md)
- [Prefix Trees](./components/parallel_prefix_operations.md) Several efficient components that leverage a variety of parallel prefix trees such as Ripple, Kogge-Stone, Sklansky, and Brent-Kung tree types.
- [Priority Encoder](./components/parallel_prefix_operations.md)
- [Or-scan](./components/parallel_prefix_operations.md)
- [Incrementer](./components/parallel_prefix_operations.md)
- [Decrementer](./components/parallel_prefix_operations.md)
- [Adders](./components/adder.md)
- [Sign Magnitude Adder](./components/adder.md#ripple-carry-adder)
- [Parallel Prefix Adder](./components/parallel_prefix_operations.md)
- Subtractors
- [One's Complement Adder Subtractor](./components/adder.md#ones-complement-adder-subtractor)
- Multipliers
@@ -48,11 +53,13 @@ Some in-development items will have opened issues, as well. Feel free to create
- Square root
- Inverse square root
- Floating point
- [Floating-Point Value Types](./components/floating_point.md)
- Double (64-bit)
- Float (32-bit)
- BFloat16 (16-bit)
- BFloat8 (8-bit)
- BFloat4 (4-bit)
- [Simple Floating-Point Adder](./components/floating_point.md#floatingpointadder)
- Fixed point
- Binary-Coded Decimal (BCD)
- [Rotate](./components/rotate.md)
27 changes: 27 additions & 0 deletions doc/components/floating_point.md
@@ -0,0 +1,27 @@
# Floating-Point Components

Floating-point operations require meticulous precision, and have standards like [IEEE-754](<https://standards.ieee.org/ieee/754/6210/>) which govern them. To support floating-point components, we have created a parallel to `Logic`/`LogicValue` which are part of [ROHD](<https://intel.github.io/rohd-website/>). Here, `FloatingPoint` is the `Logic` wire in a component that carries `FloatingPointValue` literal values. An important distinction is that these classes are parameterized to create arbitrary size floating-point values.

## FloatingPointValue

The `FloatingPointValue` class comprises the sign, exponent, and mantissa `LogicValue`s that represent a floating-point number. `FloatingPointValue`s can be converted to and from Dart native `double`s, as well as constructed from integer and string representations of their fields. They can be operated on (+, -, *, /) and compared.

Constants are available for the IEEE corner cases of a given `FloatingPointValue` size: infinities, zeros, and the limits for normal numbers (e.g. mantissa in the range [1, 2)) and sub-normal numbers (zero exponent, and mantissa < 1).
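
The normal/sub-normal distinction can be seen by pulling the raw sign, exponent, and mantissa fields out of a Dart `double`. The following is a minimal sketch using only `dart:typed_data`; the `Fields` class and `decode` function are hypothetical names for illustration and are not part of ROHD-HCL.

```dart
import 'dart:typed_data';

/// Hypothetical helper (not part of ROHD-HCL): the raw IEEE-754 fields of a
/// 64-bit double.
class Fields {
  final int sign; // 1 bit
  final int exponent; // 11 bits, biased by 1023
  final int mantissa; // 52 bits, stored without the implicit leading bit

  const Fields(this.sign, this.exponent, this.mantissa);

  /// Zero exponent with a non-zero mantissa: the significand is below 1.
  bool get isSubnormal => exponent == 0 && mantissa != 0;

  /// Exponent neither all zeros nor all ones: the significand is in [1, 2).
  bool get isNormal => exponent > 0 && exponent < 0x7ff;
}

/// Splits [d] into its sign, exponent, and mantissa fields.
Fields decode(double d) {
  // ByteData reads and writes big-endian by default.
  final data = ByteData(8)..setFloat64(0, d);
  final hi = data.getUint32(0); // sign, exponent, top 20 mantissa bits
  final lo = data.getUint32(4); // low 32 mantissa bits
  final sign = hi >> 31;
  final exponent = (hi >> 20) & 0x7ff;
  final mantissa = (hi & 0xfffff) * 0x100000000 + lo;
  return Fields(sign, exponent, mantissa);
}

void main() {
  for (final d in [1.0, -2.0, 1.5, double.minPositive]) {
    final f = decode(d);
    print('$d: sign=${f.sign} exponent=${f.exponent} '
        'mantissa=0x${f.mantissa.toRadixString(16)} '
        'normal=${f.isNormal} subnormal=${f.isSubnormal}');
  }
}
```

Reading the value as two 32-bit halves avoids 64-bit integer accessors, which are not available on every Dart platform.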

Appropriate string representations, comparison operations, and operators are available. The usefulness of `FloatingPointValue` is in the testing of `FloatingPoint` components, where we can leverage the abstraction of a floating-point value type to drive and compare floating-point values operated upon by floating-point components.

As 32-bit single-precision and 64-bit double-precision floating-point types are most common, we have `FloatingPoint32Value` and `FloatingPoint64Value` subclasses with direct converters from Dart native `double`.

Finally, we have a `FloatingPointValue` random generator for testing purposes, generating valid floating-point types, optionally constrained to normal range (mantissa in [1, 2)).

## FloatingPoint

The `FloatingPoint` type is a `LogicStructure` which comprises the `Logic` bits for the sign, exponent, and mantissa used in hardware floating-point. These types are provided to simplify and abstract the declaration and manipulation of floating-point types in hardware. This type is parameterized like `FloatingPointValue`, for exponent and mantissa width.

Again, like `FloatingPointValue`, `FloatingPoint64` and `FloatingPoint32` subclasses are provided as these are the most common floating-point number types.

## FloatingPointAdder

A very basic `FloatingPointAdder` component is available which does not perform any rounding. It takes two `FloatingPoint` `LogicStructure`s and adds them, producing a normalized `FloatingPoint` on the output. An optional input selects the type of `ParallelPrefixTree` used in the internal addition of the mantissas.
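
A minimal sketch of instantiating the adder, based on the constructor signatures that appear in this pull request's diff. The 8-bit exponent and 23-bit mantissa widths and the `buildAdder` helper name are assumptions chosen for illustration.

```dart
import 'package:rohd_hcl/rohd_hcl.dart';

/// Hypothetical helper: builds an adder sized like a single-precision float
/// (8 exponent bits, 23 mantissa bits).
FloatingPointAdder buildAdder() {
  final a = FloatingPoint(exponentWidth: 8, mantissaWidth: 23);
  final b = FloatingPoint(exponentWidth: 8, mantissaWidth: 23);
  // `ppGen` selects the parallel prefix tree; KoggeStone.new is the default
  // in the constructor shown in this diff.
  return FloatingPointAdder(a, b, ppGen: KoggeStone.new);
}

void main() {
  final adder = buildAdder();
  // The output is a FloatingPoint structure: sign, exponent, and mantissa.
  print(adder.sum.width);
}
```

Both operands must share the same exponent and mantissa widths; the constructor in this diff throws a `RohdHclException` otherwise.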

Currently, the `FloatingPointAdder` is only approximately accurate (it performs no rounding) and is not optimized for circuit performance; it provides only the key functionalities of alignment, addition, and normalization. Still, this component is a starting point for more realistic floating-point components that leverage the logical `FloatingPoint` and literal `FloatingPointValue` type abstractions.
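
The three steps of alignment, addition, and normalization can be sketched in plain Dart integers. This is a hypothetical software model, not the ROHD-HCL implementation: it assumes both operands are positive and normal, ignores exponent overflow, and truncates instead of rounding. The `Fp` class and `addTruncating` function are names invented for this illustration.

```dart
/// Hypothetical value holder: a positive, normal floating-point number.
class Fp {
  final int exponent; // biased exponent, assumed non-zero (normal)
  final int mantissa; // fraction bits without the implicit leading 1

  const Fp(this.exponent, this.mantissa);
}

/// Adds [x] and [y], truncating any bits that do not fit in [mantissaWidth].
Fp addTruncating(Fp x, Fp y, int mantissaWidth) {
  // Order the operands so that `a` has the larger magnitude.
  final swap = y.exponent > x.exponent ||
      (y.exponent == x.exponent && y.mantissa > x.mantissa);
  final a = swap ? y : x;
  final b = swap ? x : y;

  // Restore the implicit leading 1 of each normal significand.
  final hidden = 1 << mantissaWidth;
  final aSig = hidden | a.mantissa;

  // Alignment: shift the smaller significand right by the exponent gap.
  // Bits shifted out are dropped, which is where accuracy is lost.
  final bSig = (hidden | b.mantissa) >> (a.exponent - b.exponent);

  // Addition of the aligned significands.
  var sum = aSig + bSig;
  var exponent = a.exponent;

  // Normalization: a carry out of the top bit means the sum is in [2, 4),
  // so shift right once and increment the exponent.
  if (sum >= (hidden << 1)) {
    sum >>= 1;
    exponent += 1;
  }
  return Fp(exponent, sum & (hidden - 1));
}

void main() {
  // 3 mantissa bits and an exponent bias of 7: (7, 4) encodes 1.5.
  final r = addTruncating(const Fp(7, 4), const Fp(7, 4), 3);
  print('exponent=${r.exponent} mantissa=${r.mantissa}');
}
```

With a 3-bit mantissa, adding 1.0 (exponent 7) to 0.0625 (exponent 3) shifts the smaller significand entirely out of range, so the result stays at 1.0. A rounding adder would keep extra guard bits to handle such cases more accurately.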
1 change: 1 addition & 0 deletions lib/src/arithmetic/arithmetic.dart
@@ -4,6 +4,7 @@
export 'adder.dart';
export 'carry_save_mutiplier.dart';
export 'divider.dart';
export 'floating_point/floating_point.dart';
export 'multiplier.dart';
export 'multiplier_lib.dart';
export 'ones_complement_adder.dart';
112 changes: 112 additions & 0 deletions lib/src/arithmetic/arithmetic_utils.dart
@@ -0,0 +1,112 @@
// Copyright (C) 2024 Intel Corporation
// SPDX-License-Identifier: BSD-3-Clause
//
// arithmetic_utils.dart
// Helper methods for printing aligned arithmetic bitvectors.
//
// 2024 August 30
// Author: Desmond A Kirkpatrick <[email protected]

// ignore_for_file: avoid_print

import 'package:rohd/rohd.dart';
import 'package:rohd_hcl/rohd_hcl.dart';

/// Helper evaluation methods for printing aligned arithmetic bitvectors.
extension NumericVector on LogicValue {
/// Print aligned bitvector with an optional header.
/// [name] is printed at the LHS of the line, trimmed by [prefix].
/// [prefix] is the distance from the margin before the vector is printed.
/// You can align with longer bitvectors by stating the length [align].
/// [lowLimit] will trim the vector below this bit position.
/// You can insert a separator [sepChar] at position [sep].
/// A header can be printed by setting [header] to true.
/// Markdown format can be produced by setting [markDown] to true.
String vecString(String name,
{int prefix = 10,
int? align,
int? sep,
bool header = false,
String sepChar = '*',
int lowLimit = 0,
bool markDown = false}) {
final str = StringBuffer();
// ignore: cascade_invocations
if (header) {
str.write(markDown ? '|Name' : ' ' * prefix);

for (var col = (align ?? width) - 1;
col >= lowLimit;
col--) {
final bits = col > 9 ? 2 : 1;
if (sep != null && sep == col) {
str.write(markDown ? '' : ' ' * (2 - bits));
if (col > 10 || col == lowLimit) {
str.write('${markDown ? '|' : ' '}$col$sepChar');
} else {
str.write('${markDown ? '|' : ' '}$col $sepChar');
}
str.write(markDown ? '|' : '');
} else if (sep != null && sep == col + 1) {
if (sep == width) {
str
..write(sepChar)
..write(markDown ? '|' : ' ' * (2 - bits));
}
str.write('$col');
} else {
str
..write(markDown ? '|' : ' ' * (2 - bits))
..write(' $col');
}
}
str.write(markDown ? '|\n' : '\n');
if (markDown) {
str.write('|:--:');

for (var col = (align ?? width) - 1;
col >= lowLimit;
col--) {
str.write('|:--');
}
str.write('-|\n');
}
}
final String strPrefix;
strPrefix = (name.length <= prefix)
? name.padRight(prefix)
: name.substring(0, prefix);
str
..write(strPrefix)
..write(' ' * ((align ?? width) - width));
for (var col = lowLimit; col < width; col++) {
final pos = width - 1 - col + lowLimit;
final v = this[pos].bitString;
if (sep != null && sep == pos) {
if (markDown) {
str.write('|$v $sepChar');
} else {
str.write(
(pos > 9 || pos == 0) ? ' $v$sepChar ' : ' $v $sepChar');
}
} else if (sep != null && sep == pos + 1) {
if (markDown) {
str.write('|');
}
if (sep == width) {
str.write('$sepChar ');
}
str.write(v);
} else {
if (markDown) {
str.write('|');
}
str.write(' $v');
}
}
if (markDown) {
str.write('|');
}
return str.toString();
}
}
69 changes: 69 additions & 0 deletions lib/src/arithmetic/evaluate_partial_product.dart
@@ -104,4 +104,73 @@ extension EvaluateLivePartialProduct on PartialProductGenerator {
}
return str.toString();
}

/// Return the partial product matrix as a Markdown table.
String markdown() {
final str = StringBuffer();

final maxW = maxWidth();
// print bit position header
str.write('| R | M | S');
for (var i = maxW - 1; i >= 0; i--) {
str.write('| $i ');
}
str
..write('| bitvector | value|\n')
..write('|:--:' * 3);
for (var i = maxW - 1; i >= 0; i--) {
str.write('|:--:');
}
str.write('|:--: |:--:|\n');
// Partial product matrix: rows of multiplicand multiples shift by
// rowshift[row]
for (var row = 0; row < rows; row++) {
final rowStr = (row < 10) ? '0$row' : '$row';
if (row < encoder.rows) {
final encoding = encoder.getEncoding(row);
if (encoding.multiples.value.isValid) {
final first = encoding.multiples.value.firstOne() ?? -1;
final multiple = first + 1;
str.write('|$rowStr| '
'$multiple| '
'${encoding.sign.value.toInt()}');
} else {
str.write('| | |');
}
} else {
str.write('|$rowStr | |');
}
final entry = partialProducts[row].reversed.toList();
str.write('| ' * (maxW - (entry.length + rowShift[row])));
for (var col = 0; col < entry.length; col++) {
str.write('|${entry[col].value.bitString}');
}
final suffixCnt = rowShift[row];
final value = entry.swizzle().value.zeroExtend(maxW) << suffixCnt;
final intValue = value.isValid ? value.toBigInt() : BigInt.from(-1);
str
..write('| ' * suffixCnt)
..write('| ${value.bitString}')
..write('| ${value.isValid ? intValue : "<invalid>"}'
' (${value.isValid ? intValue.toSigned(maxW) : "<invalid>"})|\n');
}
// Compute and print binary representation from accumulated value
// Later: we will compare with a compression tree result
str.write('||\n');

final sum = LogicValue.ofBigInt(evaluate(), maxW);
// print out the sum as a MSB-first bitvector
str.write('|||');
for (final elem in [for (var i = 0; i < maxW; i++) sum[i]].reversed) {
str.write('|${elem.toInt()} ');
}
final val = evaluate();
str.write('| ${sum.bitString}| '
'${val.toUnsigned(maxW)}');
if (isSignExtended) {
str.write(' ($val)');
}
str.write('|\n');
return str.toString();
}
}
6 changes: 6 additions & 0 deletions lib/src/arithmetic/floating_point/floating_point.dart
@@ -0,0 +1,6 @@
// Copyright (C) 2024 Intel Corporation
// SPDX-License-Identifier: BSD-3-Clause

export 'floating_point_adder.dart';
export 'floating_point_logic.dart';
export 'floating_point_value.dart';
107 changes: 107 additions & 0 deletions lib/src/arithmetic/floating_point/floating_point_adder.dart
@@ -0,0 +1,107 @@
// Copyright (C) 2024 Intel Corporation
// SPDX-License-Identifier: BSD-3-Clause
//
// floating_point_adder.dart
// A very basic floating-point adder component.
//
// 2024 August 30
// Author: Desmond A Kirkpatrick <[email protected]

import 'package:meta/meta.dart';
import 'package:rohd/rohd.dart';
import 'package:rohd_hcl/rohd_hcl.dart';

/// An adder module for FloatingPoint values
class FloatingPointAdder extends Module {
/// Width of the exponent field. Must be greater than 0.
final int exponentWidth;

/// Width of the mantissa field. Must be greater than 0.
final int mantissaWidth;

/// Output [FloatingPoint] computed
late final FloatingPoint sum =
FloatingPoint(exponentWidth: exponentWidth, mantissaWidth: mantissaWidth)
..gets(output('sum'));

/// The result of [FloatingPoint] addition
@protected
late final FloatingPoint _sum =
FloatingPoint(exponentWidth: exponentWidth, mantissaWidth: mantissaWidth);

/// Swapping two FloatingPoint structures based on a conditional
static (FloatingPoint, FloatingPoint) _swap(
Logic swap, (FloatingPoint, FloatingPoint) toSwap) =>
(
toSwap.$1.clone()..gets(mux(swap, toSwap.$2, toSwap.$1)),
toSwap.$2.clone()..gets(mux(swap, toSwap.$1, toSwap.$2))
);

/// Add two floating point numbers [a] and [b], returning result in [sum]
FloatingPointAdder(FloatingPoint a, FloatingPoint b,
{ParallelPrefix Function(List<Logic>, Logic Function(Logic, Logic))
ppGen = KoggeStone.new,
super.name})
: exponentWidth = a.exponent.width,
mantissaWidth = a.mantissa.width {
if (b.exponent.width != exponentWidth ||
b.mantissa.width != mantissaWidth) {
throw RohdHclException('FloatingPoint widths must match');
}
a = a.clone()..gets(addInput('a', a, width: a.width));
b = b.clone()..gets(addInput('b', b, width: b.width));
addOutput('sum', width: _sum.width) <= _sum;

// Ensure that the larger number is wired as 'a'
final doSwap = a.exponent.lt(b.exponent) |
(a.exponent.eq(b.exponent) & a.mantissa.lt(b.mantissa)) |
((a.exponent.eq(b.exponent) & a.mantissa.eq(b.mantissa)) & b.sign);

(a, b) = _swap(doSwap, (a, b));

final aExp =
a.exponent + mux(a.isNormal(), a.zeroExponent(), a.oneExponent());
final bExp =
b.exponent + mux(b.isNormal(), b.zeroExponent(), b.oneExponent());

// Align and add mantissas
final expDiff = aExp - bExp;
// print('${expDiff.value.toInt()} exponent diff');
final adder = SignMagnitudeAdder(
a.sign,
[a.isNormal(), a.mantissa].swizzle(),
b.sign,
[b.isNormal(), b.mantissa].swizzle() >>> expDiff,
(a, b) => ParallelPrefixAdder(a, b, ppGen: ppGen));

final sum = adder.sum.slice(adder.sum.width - 2, 0);
final leadOneE =
ParallelPrefixPriorityEncoder(sum.reversed, ppGen: ppGen).out;
final leadOne = leadOneE.zeroExtend(exponentWidth);

// Assemble the output FloatingPoint
_sum.sign <= adder.sign;
Combinational([
If.block([
Iff(adder.sum[-1] & a.sign.eq(b.sign), [
_sum.mantissa < (sum >> 1).slice(mantissaWidth - 1, 0),
_sum.exponent < a.exponent + 1
]),
ElseIf(a.exponent.gt(leadOne) & sum.or(), [
_sum.mantissa < (sum << leadOne).slice(mantissaWidth - 1, 0),
_sum.exponent < a.exponent - leadOne
]),
ElseIf(leadOne.eq(0) & sum.or(), [
_sum.mantissa < (sum << leadOne).slice(mantissaWidth - 1, 0),
_sum.exponent < a.exponent - leadOne + 1
]),
Else([
// subnormal result
_sum.mantissa < sum.slice(mantissaWidth - 1, 0),
_sum.exponent < _sum.zeroExponent()
])
])
]);
// print('final sum: ${_sum.value.bitString}');
}
}