Best implementations of quaternary adders
Abstract
The implementation of a quaternary 1-digit adder composed of a 2-bit binary adder, quaternary-to-binary decoders and binary-to-quaternary encoders is compared with several recent implementations of quaternary adders. This simple implementation outperforms all other implementations that use only one power supply, and it is equivalent to the best other implementation that uses three power supplies. Since the best quaternary adder is built around a 2-bit binary adder, the interface circuits between quaternary and binary levels are pure overhead compared to the binary adder. This result shows that the quaternary approach to adders uses more transistors, more chip area and more power than the corresponding binary approach.
I Introduction
Many designs of quaternary adders have been proposed in recent years. Most of these papers are based on simulations using parameters of CNTFET technology. The most significant recent ones are [1], [2] and [3]:
- [1] only uses one power supply.
- The quaternary half adder presented in [2] uses three power supplies, although the technique used to obtain the intermediate supply voltages is not specified.
- [3] presents both single-supply and three-supply versions.
In this paper, we propose a new design of quaternary adders using the same assumptions as in these three papers. This design leads to the most efficient implementation in terms of transistor count.
II Methodology
II-A Why CNTFET technology?
This technology uses field-effect transistors whose channel is a single carbon nanotube or an array of carbon nanotubes instead of the bulk silicon of traditional MOSFETs. MOSFET-like CNTFETs, which exist in p and n types, look the most promising. The technology has advantages and drawbacks:
- CNTFETs have variable threshold voltages (the threshold voltage is inversely related to the nanotube diameter). This is a big advantage compared to CMOS, for which different masks are needed to get different threshold voltages.
- Other advantages include high electron mobility, high current density and high transconductance.
- Lifetime issues, reliability issues, difficulties in mass production and production costs are cited as disadvantages.
- CNTFET technology is far from mature. In 2019, a 16-bit RISC microprocessor was built with 14,000 CNTFETs [4]. While this is an advance for CNTFET technology, note that the Intel 8086 CPU, a 16-bit microprocessor, was launched in 1978 with 29,000 transistors, more than 40 years ago!
However, as CMOS circuits and CNTFET ones have basically the same circuit styles, CNTFETs can be used to propose a new implementation of quaternary adders and compare it with previous published proposals.
II-B Comparing different implementations of quaternary adders
The transistor count is used to compare different implementations of quaternary adders. As the comparisons use the same technology and the same operators, the transistor count is significant: it is very doubtful that more transistors could lead to
- fewer interconnects
- reduced chip area
- reduced power dissipation
- reduced propagation delays
- etc.
III Quaternary circuits
III-A Four different levels
While binary circuits have 0 and 1 levels, quaternary circuits have four levels 0 < 1 < 2 < 3. The corresponding levels could be voltage, current or charge levels.
- Charge levels. This approach is used in flash memories. 4-valued (MLC) flash memories store two bits per cell. 8-valued (TLC) memories store 3 bits per cell. In 2018, ADATA, Intel, Micron and Samsung launched SSD products using QLC NAND memory with 4 bits per cell. While binary flash memories have the advantage of faster write speeds, lower power consumption and higher cell endurance, M-valued flash memories provide higher data density and lower costs. But charge levels are not suitable for combinational circuits.
- Current levels. Current levels have been used, but are no longer suitable because of static power dissipation. Power dissipation is the main issue in today's integrated circuits.
- Voltage levels. This is the only practical approach to design combinational circuits.
III-B Three or one power supplies
The first approach to get four different voltage levels is to use three power supplies, one for each of the levels 1, 2 and 3. Fig. 1 presents a possible implementation using transmission gates. True and complementary control inputs are used to transmit to the output one of the four voltage levels corresponding to 0, 1, 2 and 3. The three-supply version of [3] uses the same scheme. The drawback of this approach is that it needs three voltage supplies instead of one in the binary case.

The second approach uses only one power supply for levels 0 and 3 and generates levels 1 and 2 through resistor-like dividers. Fig. 2 shows a first implementation. There are four separate paths: only one should be active to get each output value. Transistors T1, T2, T5 and T6 are always on (resistor behavior). The inputs of the other transistors must be set to turn these transistors on or off.
- Level 0: T9 on; T0, T3, T4, T7 and T8 off
- Level 1: T0 and T3 on; T4, T7, T8 and T9 off
- Level 2: T4 and T7 on; T0, T3, T8 and T9 off
- Level 3: T8 on; T0, T3, T4, T7 and T9 off
Fig. 3 presents a variant of the previous one. Only one path with resistor-like transistors is used with two resistor-connected p and two resistor-connected n transistors. T6 is used to bypass T1 and T7 is used to bypass T4.
- Level 0: T9 on; T0, T5, T6, T7 and T8 off
- Level 1: T0 and T7 on; T5, T6, T8 and T9 off
- Level 2: T5 and T6 on; T0, T7, T8 and T9 off
- Level 3: T8 on; T0, T5, T6, T7 and T9 off

Both circuits are similar, with 10 transistors. This approach has two drawbacks. Levels 1 and 2 generate static power dissipation. The resistors in paths 1 and 2 increase the RC loads and degrade switching times compared to paths 0 and 3.
III-C Encoder and decoder circuits
The encoder circuits can be derived from the circuits presented in Fig. 1, Fig. 2 and Fig. 3. The decoder circuits are easy to implement. They correspond to Table I, in which binary values are 0 and 3. The NQI, IQI and PQI outputs are provided by 3 inverters having 3 different threshold levels. Fig. 4 shows the corresponding circuits presented in [1]. The situation is similar whether circuits use 3 or 1 power supplies. Appropriate threshold levels are obtained by choosing the chirality of each transistor used in the inverter.
IN | NQI | IQI | PQI |
---|---|---|---|
0 | 3 | 3 | 3 |
1 | 0 | 3 | 3 |
2 | 0 | 0 | 3 |
3 | 0 | 0 | 0 |
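
The behaviour of the three threshold inverters of Table I can be sketched as follows. This is only a behavioural model (integer levels 0 to 3, binary outputs coded as 0 and 3); the function names and the `threshold` parameter are illustrative assumptions, not part of the circuits of Fig. 4.

```python
# Behavioural model of the NQI, IQI and PQI inverters of Table I.
# Levels are integers 0..3; "binary" outputs use the values 0 and 3.

def threshold_inverter(q: int, threshold: int) -> int:
    """Output 3 while the input is below the threshold, else 0."""
    if not 0 <= q <= 3:
        raise ValueError("quaternary level must be in 0..3")
    return 3 if q < threshold else 0

def nqi(q: int) -> int:
    return threshold_inverter(q, 1)   # high only for input 0

def iqi(q: int) -> int:
    return threshold_inverter(q, 2)   # high for inputs 0 and 1

def pqi(q: int) -> int:
    return threshold_inverter(q, 3)   # high for inputs 0, 1 and 2

TABLE_I = [(q, nqi(q), iqi(q), pqi(q)) for q in range(4)]

if __name__ == "__main__":
    for row in TABLE_I:
        print(" | ".join(str(v) for v in row))
```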
IV How to implement a quaternary adder
Table II shows the truth table of a 1-digit quaternary adder. There are different techniques to implement a quaternary 1-digit adder:
- The simplest way is to use a 2-bit binary adder and to interface it with a 4-to-2 decoder and a 2-to-4 encoder. The corresponding adder is presented in section V.
- The opposite approach is the direct implementation of Table II by using the general approach. A function f(inputs) is decomposed into f(inputs) = 3.f3 + 2.f2 + 1.f1, where f3, f2 and f1 are respectively the binary functions of the inputs for which the function has values 3, 2 and 1. f3, f2 and f1 include the NQI, IQI and PQI functions of the input variables (Table I). This approach is used in the adder presented in section VI.
- •
From the 1-digit quaternary adder, N-digit quaternary carry-propagate (CPA), carry-lookahead (CLA) and carry-skip (CSA) adders can be easily derived.
A | B | Ci | QS | QC | A | B | Ci | QS | QC |
---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 2 | 0 |
0 | 2 | 0 | 2 | 0 | 0 | 2 | 1 | 3 | 0 |
0 | 3 | 0 | 3 | 0 | 0 | 3 | 1 | 0 | 1 |
1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 2 | 0 |
1 | 1 | 0 | 2 | 0 | 1 | 1 | 1 | 3 | 0 |
1 | 2 | 0 | 3 | 0 | 1 | 2 | 1 | 0 | 1 |
1 | 3 | 0 | 0 | 1 | 1 | 3 | 1 | 1 | 1 |
2 | 0 | 0 | 2 | 0 | 2 | 0 | 1 | 3 | 0 |
2 | 1 | 0 | 3 | 0 | 2 | 1 | 1 | 0 | 1 |
2 | 2 | 0 | 0 | 1 | 2 | 2 | 1 | 1 | 1 |
2 | 3 | 0 | 1 | 1 | 2 | 3 | 1 | 2 | 1 |
3 | 0 | 0 | 3 | 0 | 3 | 0 | 1 | 0 | 1 |
3 | 1 | 0 | 0 | 1 | 3 | 1 | 1 | 1 | 1 |
3 | 2 | 0 | 1 | 1 | 3 | 2 | 1 | 2 | 1 |
3 | 3 | 0 | 2 | 1 | 3 | 3 | 1 | 3 | 1 |
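
The truth table of the 1-digit quaternary adder follows directly from integer addition: QS is the sum modulo 4 and QC is the carry. A minimal sketch, assuming a binary carry input (0 or 1) as in Table II; the function names are illustrative.

```python
# 1-digit quaternary full adder defined arithmetically:
# QS = (A + B + Ci) mod 4, QC = (A + B + Ci) div 4.

def quaternary_full_adder(a: int, b: int, ci: int) -> tuple[int, int]:
    """Return (QS, QC) for quaternary digits a, b and a binary carry-in."""
    if a not in range(4) or b not in range(4) or ci not in (0, 1):
        raise ValueError("a, b must be in 0..3 and ci in 0..1")
    total = a + b + ci
    return total % 4, total // 4

def truth_table():
    """Yield the 32 rows (A, B, Ci, QS, QC), Ci = 0 first."""
    for ci in (0, 1):
        for a in range(4):
            for b in range(4):
                qs, qc = quaternary_full_adder(a, b, ci)
                yield a, b, ci, qs, qc

if __name__ == "__main__":
    for row in truth_table():
        print(" | ".join(str(v) for v in row))
```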
V Quaternary adders with quaternary to binary interfaces
The simplest way to implement a quaternary adder is to interface a 2-bit binary adder with quaternary-to-binary decoder and binary-to-quaternary encoder circuits. Table III presents the truth table of the quaternary-to-binary conversion. Binary values are 0 and 3.
V-A 4 to 2 decoder circuit
The decoder circuit is presented in Fig. 5. The circuitry is the same using 3 or 1 voltage supplies. It is based on the inverters 1, 2 and 3 with the different threshold levels (such as the inverters presented in Fig. 4) followed by usual binary gates. The number of transistors depends on the implementation of the XOR gate. It ranges from 16 T when using 4 Nand gates down to 3 T as proposed in [5] (Fig. 6). An acceptable value is 9 T, which corresponds to a conventional CMOS implementation used in [6]. This implementation doesn't use pass transistors and has a full-swing output. The overall transistor count for the decoder ranges from 28 T (most conservative implementation) down to 15 T, with 21 T as an acceptable value.
Q | NQI | IQI | PQI | X1 | X0 |
---|---|---|---|---|---|
0 | 3 | 3 | 3 | 0 | 0 |
1 | 0 | 3 | 3 | 0 | 3 |
2 | 0 | 0 | 3 | 3 | 0 |
3 | 0 | 0 | 0 | 3 | 3 |
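
The conversion of Table III can be checked with a small behavioural model. The gate arrangement below (one inverter, one XOR and one NOR after the three threshold inverters) is an assumption: it reproduces Table III and matches the quoted counts (6 T + 2 T + 4 T + XOR, i.e. 28 T, 21 T or 15 T), but the exact gates of Fig. 5 may differ.

```python
# Behavioural sketch of a 4-to-2 decoder: quaternary digit -> (X1, X0).
# Binary signals are coded 0 (low) and 3 (high), as in Table III.

HIGH, LOW = 3, 0

def threshold_inverter(q: int, threshold: int) -> int:
    return HIGH if q < threshold else LOW

def b_not(x: int) -> int:
    return LOW if x == HIGH else HIGH

def b_nor(x: int, y: int) -> int:
    return HIGH if (x == LOW and y == LOW) else LOW

def b_xor(x: int, y: int) -> int:
    return HIGH if x != y else LOW

def decode(q: int) -> tuple[int, int]:
    """Return (X1, X0) for a quaternary digit q in 0..3."""
    nqi = threshold_inverter(q, 1)
    iqi = threshold_inverter(q, 2)
    pqi = threshold_inverter(q, 3)
    x1 = b_not(iqi)                      # high for q = 2 and q = 3
    x0 = b_nor(nqi, b_xor(iqi, pqi))     # high for q = 1 and q = 3
    return x1, x0

if __name__ == "__main__":
    for q in range(4):
        print(q, decode(q))
```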
V-B 2 to 4 encoder circuits
The binary to quaternary encoder circuits depend on the technique that is used to generate the four output values.
V-B1 Encoder of Fig. 1
The encoder circuit corresponding to this approach is shown in Fig. 7. It uses 16 T.
V-B2 Encoder of Fig. 2
The inputs of transistors T0, T3, T4, T7, T8 and T9 must be controlled. p transistors are on when the input is 0 and n transistors are on when the input is 1. The corresponding truth table is shown in Table IV. The corresponding equations are
-
•
-
•
-
•
-
•
-
•
-
•
4 NOT gates are needed (, IT3 and IT4), together with 3 Nand and 1 Nor gates to control the inputs. The total transistor count is 8 (NOT) + 16 (Nand and Nor) + 10 (Fig. 2) = 34 T.
X1 | X0 | IT0 | IT3 | IT4 | IT7 | IT8 | IT9 |
---|---|---|---|---|---|---|---|
0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 |
1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 |
V-B3 Encoder of Fig. 3
The inputs of transistors T0, T5, T6, T7, T8 and T9 must be controlled. p transistors are on when the input is 0 and n transistors are on when the input is 1. The corresponding truth table is shown in Table V. The corresponding equations are
- IT0 = NAND(NOT X1, X0)
- IT5 = NOR(NOT X1, X0)
- IT6 = NOT IT5
- IT7 = NOT IT0
- IT8 = NAND(X1, X0)
- IT9 = NOR(X1, X0)

4 NOT gates are needed (complements of X1 and X0, IT6 and IT7), together with 2 Nand and 2 Nor gates to control the inputs. The total transistor count is 8 (NOT) + 16 (Nand and Nor) + 10 (Fig. 3) = 34 T.
X1 | X0 | IT0 | IT5 | IT6 | IT7 | IT8 | IT9 |
---|---|---|---|---|---|---|---|
0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 |
1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 |
1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 |
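
The control signals of the Fig. 3 encoder can be checked against Table V with a few lines of code. The gate equations used here are one set that reproduces Table V with 4 NOT, 2 Nand and 2 Nor gates; they are an assumption derived from the table, and the function names are illustrative.

```python
# Control logic for the single-supply encoder of Fig. 3.
# Inputs X1, X0 and outputs are plain binary values 0/1.
# p transistors (T0, T6, T8) conduct on 0; n transistors (T5, T7, T9) on 1.

def b_not(x: int) -> int:
    return 1 - x

def b_nand(x: int, y: int) -> int:
    return 1 - (x & y)

def b_nor(x: int, y: int) -> int:
    return 1 - (x | y)

def fig3_controls(x1: int, x0: int) -> tuple[int, ...]:
    """Return (IT0, IT5, IT6, IT7, IT8, IT9) for the binary pair (X1, X0)."""
    it0 = b_nand(b_not(x1), x0)   # 0 only for level 1: T0 on
    it5 = b_nor(b_not(x1), x0)    # 1 only for level 2: T5 on
    it6 = b_not(it5)              # 0 only for level 2: T6 on
    it7 = b_not(it0)              # 1 only for level 1: T7 on
    it8 = b_nand(x1, x0)          # 0 only for level 3: T8 on
    it9 = b_nor(x1, x0)           # 1 only for level 0: T9 on
    return it0, it5, it6, it7, it8, it9

TABLE_V = {
    (0, 0): (1, 0, 1, 0, 1, 1),
    (0, 1): (0, 0, 1, 1, 1, 0),
    (1, 0): (1, 1, 0, 0, 1, 0),
    (1, 1): (1, 0, 1, 0, 0, 0),
}

if __name__ == "__main__":
    for inputs, expected in TABLE_V.items():
        print(inputs, fig3_controls(*inputs), fig3_controls(*inputs) == expected)
```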
V-B4 Transistor count for encoder and decoder circuits for quaternary to binary interfaces
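The transistor count for one decoder plus one encoder is
- 3 power supplies: from 28 + 16 = 44 T (most conservative decoder) down to 15 + 16 = 31 T
- 1 power supply: from 28 + 34 = 62 T down to 15 + 34 = 49 T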
V-C 1-digit quaternary adder using a binary adder
There are many different ways to implement binary adders; they differ, for instance, in whether or not they use transmission gates. It is out of the scope of this paper to present all the possible implementations. Fig. 8 presents two typical implementations of a full adder. The left part only uses Nand gates. The right part uses Xor and Nand gates. A CNTFET 8 T full adder (Fig. 9) was presented in [5]. This adder doesn't restore levels, and using it could raise issues for both noise margins and switching times because of the series of pass transistors. The transistor counts are respectively 36 T, 18 T and 8 T. The quaternary adder uses two binary full adders, one encoder circuit and one decoder circuit. Using 2-bit carry-propagate adders, the overall transistor count for the three-supply version is thus:
- 72 + 44 = 116 T without using pass transistors
- 36 + 31 = 67 T when using pass transistors for Xor gates
- 16 + 31 = 47 T when using pass transistors for Xor gates and the 8 T binary adder (Fig. 9)
The single-supply version would use more transistors (+ 18 T).
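
The arithmetic behind these totals can be written down explicitly. The sketch below only restates the counts quoted above; pairing the 28 T decoder with the Nand-only adder and the 15 T decoder with the two pass-transistor styles is inferred from the sums 72 + 44 and 36 + 31, and the dictionary keys are illustrative names.

```python
# Transistor count of the 1-digit quaternary adder built from a 2-bit
# binary adder, one decoder and one encoder (counts taken from the text).

FULL_ADDER_T = {"nand_only": 36, "xor_nand": 18, "pass_8t": 8}
DECODER_T = {"nand_only": 28, "xor_nand": 15, "pass_8t": 15}
ENCODER_T = {"three_supplies": 16, "one_supply": 34}

def qb_adder_count(style: str, supplies: str) -> int:
    """Two full adders (2-bit CPA) plus decoder and encoder overhead."""
    return 2 * FULL_ADDER_T[style] + DECODER_T[style] + ENCODER_T[supplies]

if __name__ == "__main__":
    for supplies in ENCODER_T:
        counts = [qb_adder_count(style, supplies) for style in FULL_ADDER_T]
        print(supplies, counts)
```

With these inputs the three-supply version gives 116, 67 and 47 T, and the single-supply version is 18 T larger in each case.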
V-D N-digit quaternary adders
Using quaternary interfaces and 2N-bit adders, N-digit quaternary adders can be implemented. CPA, CLA and CSA implementations are discussed in section VIII.
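
Functionally, this construction amounts to packing each quaternary digit into two bits, performing a 2N-bit binary addition and unpacking the result. A minimal behavioural sketch, assuming digits are listed most significant first; the helper names are illustrative.

```python
# N-digit quaternary addition performed by a 2N-bit binary addition.

def digits_to_int(digits: list[int]) -> int:
    """Decoder side: pack quaternary digits (MSD first) into a binary word."""
    value = 0
    for d in digits:
        if not 0 <= d <= 3:
            raise ValueError("quaternary digit must be in 0..3")
        value = (value << 2) | d
    return value

def int_to_digits(value: int, n: int) -> list[int]:
    """Encoder side: unpack the low 2n bits into n quaternary digits."""
    return [(value >> (2 * i)) & 3 for i in reversed(range(n))]

def quaternary_add(a: list[int], b: list[int], carry_in: int = 0):
    """Return (sum digits, binary carry out) for two N-digit operands."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of digits")
    n = len(a)
    total = digits_to_int(a) + digits_to_int(b) + carry_in
    return int_to_digits(total, n), total >> (2 * n)

if __name__ == "__main__":
    print(quaternary_add([1, 2], [2, 3]))
    print(quaternary_add([3, 3, 3, 3], [0, 0, 0, 1]))
```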
VI Quaternary adders presented in [1]
These adders are based on the following approach: Qs = 3.f3(inputs) + 2.f2(inputs) + 1.f1(inputs), where fi(inputs) is the binary function for which Qs = i. Any input must be decomposed according to Table VI. The corresponding circuit is shown in Fig. 10. It uses 18 T.
For the half adder, according to the left part of Table II, the equations are
The half adder circuit is presented in Fig. 11. With 2 input decoders, the sum circuit and the carry circuit, the transistor count is 87 T. The corresponding full adder presented in [1] has a quaternary carry input. While this could be useful for designing compressors used in multiplier reduction trees, it is useless for the usual N-digit adders, in which carry input and output have binary values. In Fig. 12, we present a modified version in which binary carries are used. The half adder implements the H function, defined as H = (A+B) mod 4. A modified half adder implements Sum = H + C. It takes the decoded values of the quaternary input H (provided by the Q-dec shown in Fig. 10) and the binary carry input. The corresponding scheme is shown in Fig. 13. With one Q-Dec (H), one NQI inverter + one binary inverter to generate C0 and C1, it has 8 T + 4 T + 28 T = 40 T, while the sum part of the quaternary half-adder has 52 T. The carry generator circuit is based on the following observations:
- Cout = 0 iff A+B+Cin < 4, and Cout = 1 iff A+B+Cin > 3
- Cout = 0 iff (Cin = 0 and A+B < 4) or (Cin = 1 and A+B < 3)
- The correspondence between H = (A+B) mod 4 and A+B is given in Table VII
- From Table VII, Cout = 1 iff A+B > 3, or Cin = 1 and A+B = 3
The corresponding carry generator circuit is shown in Fig. 14.
The complete modified quaternary adder has 52 T (sum part of QHA) + 40 T (sum part of modified QHA) + 19 T (carry circuit) = 111 T. This number is minimal, as the minimal number of Q-DEC is used, assuming that there are no fan-out or routing issues.
I | I0 | I1 | NOT I1 | Ii | NOT I2 | I2 | I3 |
---|---|---|---|---|---|---|---|
0 | 3 | 0 | 3 | 3 | 3 | 0 | 3 |
1 | 0 | 3 | 0 | 3 | 3 | 0 | 3 |
2 | 0 | 0 | 3 | 0 | 0 | 3 | 3 |
3 | 0 | 0 | 3 | 0 | 3 | 0 | 0 |
Cin | A+B | H | Cout | Cin | A+B | H | Cout | |
---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | |
0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | |
0 | 2 | 2 | 0 | 1 | 2 | 2 | 0 | |
0 | 3 | 3 | 0 | 1 | 3 | 3 | 1 | |
0 | 4 | 0 | 1 | 1 | 4 | 0 | 1 | |
0 | 5 | 1 | 1 | 1 | 5 | 1 | 1 | |
0 | 6 | 2 | 1 | 1 | 6 | 2 | 1 |
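
The two-stage structure described above (half adder producing H, then a modified half adder adding the binary carry) can be checked exhaustively. In this behavioural sketch the signal `ch`, the carry of the first half adder (A+B > 3), is an assumed intermediate name; the carry rule is the one read from Table VII.

```python
# Behavioural check of the modified quaternary full adder with binary carries.

def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (H, ch) with H = (A+B) mod 4 and ch = 1 iff A+B > 3."""
    return (a + b) % 4, (a + b) // 4

def modified_full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Sum = (H + Cin) mod 4; Cout = 1 iff A+B > 3 or (Cin = 1 and A+B = 3)."""
    h, ch = half_adder(a, b)
    total_is_three = (ch == 0 and h == 3)      # A+B = 3 exactly
    cout = 1 if (ch == 1 or (cin == 1 and total_is_three)) else 0
    return (h + cin) % 4, cout

def reference(a: int, b: int, cin: int) -> tuple[int, int]:
    total = a + b + cin
    return total % 4, total // 4

if __name__ == "__main__":
    ok = all(
        modified_full_adder(a, b, c) == reference(a, b, c)
        for a in range(4) for b in range(4) for c in (0, 1)
    )
    print("matches integer addition:", ok)
```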
VII MUX based quaternary adders
The MUX-based implementation is based on the observation of the quaternary half adder truth table (left part of Table II, where Ci = 0). When A = 0, QS = B. When A = 1, QS = (B+1) mod 4 (successor of B). When A = 2, QS = (B+2) mod 4 (second-level successor of B). When A = 3, QS = (B-1) mod 4 (predecessor of B).
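
This selection principle can be sketched behaviourally: A drives a 4:1 multiplexer whose data inputs are B, its successor, its second-level successor and its predecessor. The function names are illustrative, and the carry is computed arithmetically rather than with the circuits of the cited papers.

```python
# MUX-based quaternary half adder: A selects among four functions of B.

def successor(b: int) -> int:
    return (b + 1) % 4

def second_successor(b: int) -> int:
    return (b + 2) % 4

def predecessor(b: int) -> int:
    return (b - 1) % 4

def mux_half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (QS, QC) for quaternary digits a and b."""
    data_inputs = (b, successor(b), second_successor(b), predecessor(b))
    qs = data_inputs[a]                 # 4:1 MUX controlled by A
    qc = 1 if a + b > 3 else 0
    return qs, qc

if __name__ == "__main__":
    for a in range(4):
        print(a, [mux_half_adder(a, b) for b in range(4)])
```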
VII-A Quaternary adder derived from [2]
The quaternary half adder presented in [2] uses the decoder circuits and the muxes presented in Fig. 15. The QTG circuits are used to implement the successor and predecessor functions. Transistor counts for QDEC and QMUX are both 16 T. The half adder based on QDEC and QMUX is presented in Fig. 16. The transistor counts are
- For QS, there are 4 QTGs and 2 QDECs for a total of 16 T*6 = 96 T.
- For QCarry, there are 6 inverters, 6 transistors and 1 QTG for a total of 12 + 6 + 16 = 32 T.
- The half adder has 128 T.
The corresponding full adder is not presented in [2]. However, the full adder can be easily derived. The half adder (Fig. 16) corresponds to C=0. To compute QSUM1 corresponding to C=1, only two more QTGs are needed. The final sum is derived from QSUM0 and QSUM1 by using two transmission gates and one inverter. A similar technique is used to compute the carry output, as shown in Fig. 18. Only one more QTG and two transmission gates are needed. The overall transistor count for the full adder is 96 + 32 + 6 + 16 + 4 = 154 T.
VII-B Quaternary adders presented in [3]
These adders also use MUXes, but implement the successor, second-level successor and predecessor circuits as separate blocks. Basically, the half adder presented in Fig. 19 is similar to the half adder of Fig. 16. The corresponding full adder, presented in Fig. 20, also uses the same approach as the full adder of Fig. 17 and Fig. 18. Two versions are presented, with one and three power supplies. The different components are
- QMUX 4:1 is shown in Fig. 21. It has 12 T.
- QMUX (not shown) is simpler, with only 6 T.
- The successor circuit with 3 power supplies is shown in Fig. 22. It has 6 T. The second-level successor and predecessor circuits (not shown) also have 6 T each. The transistor count for the 3 circuits is 18 T.
- The successor circuit with 1 power supply is shown in Fig. 23. It has 13 T. The second-level successor and the predecessor circuits (not shown) have respectively 12 T and 17 T. The transistor count for the 3 circuits is 42 T.
- Inverters are needed on input B.
  - 3 power supplies: If the B inverters drive the different subblocks, the fan-outs are respectively 10, 6 and 8. Only 3 inverters (6 T) are needed, but there could be fan-out and routing issues. If different B inverters are used for each subblock, there are 12 inverters (24 T).
  - 1 power supply: If the B inverters drive the different subblocks, the fan-outs are respectively 11, 9 and 11. There are 3 inverters (6 T). With different inverters for each subblock, there are 12 inverters (24 T).

The overall transistor count is given in Table VIII. Obviously, the three-supply version is more efficient than the version presented in [2]: customizing the implementation of the successor and predecessor functions reduces the transistor count compared with using 4-valued MUXes. The single-supply version has far more transistors.
Version | S-HA | C-HA | SFA | CFA | Inverters | Total |
---|---|---|---|---|---|---|
3 supplies | 30 | 14 | 12 | 20 | 6/24 | 82/100 |
1 supply | 54 | 14 | 36 | 20 | 6/24 | 130/148 |
VIII Carry Look Ahead and Carry Skip Adders
We now compare the carry computation for 8-bit and 4-digit CLA and CSA adders. The binary computation is decomposed into two 4-bit blocks. The quaternary computation only uses one block.
VIII-A Carry-Look Ahead Adders
Fig. 24 presents a 4-bit carry look-ahead adder. The binary equations of the carry computation part are well-known:

$G_i = A_i \cdot B_i$, $P_i = A_i + B_i$ (or $P_i = A_i \oplus B_i$), $C_i = G_i + P_i \cdot C_{i-1}$
Binary Gi and Pi functions are implemented respectively by Nand + Inverter and Nor + inverter. Both functions use 6 T.
The optimal implementation of C1, C2, C3 and C4 uses a complex gate + one inverter. The transistor count for a 4-bit carry computation is given in Table IX.
For quaternary adders, the binary G and P functions for any bit j are:
According to Table VI, the equations can be reformulated as
where A0 and B0 are the outputs of NQI inverters, Ai and Bi are the outputs of IQI inverters, A3 and B3 are the outputs of PQI inverters, and A1, A2, B1 and B2 are the outputs of the circuit shown in Fig. 10.
Assuming that all these values are available, the transistor count is 12 T for G and 16 T for P.
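
The role of the binary G and P functions of a quaternary digit can be illustrated with a behavioural model. The definitions used here (G = 1 iff A+B > 3, P = 1 iff A+B > 2, by analogy with the binary case Pi = Ai + Bi) are assumptions for illustration, since the gate-level equations are not reproduced; the look-ahead carries are then compared with ripple carries.

```python
# Carry look-ahead over quaternary digits using binary G and P signals.
from itertools import product

def generate(a: int, b: int) -> int:
    return int(a + b >= 4)      # a carry is produced whatever the carry-in

def propagate(a: int, b: int) -> int:
    return int(a + b >= 3)      # a carry-in of 1 produces a carry-out

def lookahead_carries(a_digits, b_digits, c0: int) -> list[int]:
    """Digits are least significant first; returns [C1, ..., CN]."""
    g = [generate(a, b) for a, b in zip(a_digits, b_digits)]
    p = [propagate(a, b) for a, b in zip(a_digits, b_digits)]
    carries = []
    for j in range(len(g)):
        # Expanded form: C(j+1) = Gj + Pj.G(j-1) + ... + Pj...P0.C0
        c = c0
        for k in range(j + 1):
            c &= p[k]
        for i in range(j + 1):
            term = g[i]
            for k in range(i + 1, j + 1):
                term &= p[k]
            c |= term
        carries.append(c)
    return carries

def ripple_carries(a_digits, b_digits, c0: int) -> list[int]:
    carries, c = [], c0
    for a, b in zip(a_digits, b_digits):
        c = (a + b + c) // 4
        carries.append(c)
    return carries

if __name__ == "__main__":
    ok = all(
        lookahead_carries(a, b, c0) == ripple_carries(a, b, c0)
        for a in product(range(4), repeat=3)
        for b in product(range(4), repeat=3)
        for c0 in (0, 1)
    )
    print("look-ahead carries match ripple carries:", ok)
```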
For 4 digits, the equations are similar with different implementations of Gi and Pi functions. The transistor count for a 4-digit carry computation is given in Table X.
Table IX (binary carry computation):

Function | Gi | Pi | C1 | C2 | C3 | C4 | 4-bit | 8-bit |
---|---|---|---|---|---|---|---|---|
T. count | 24 | 24 | 8 | 12 | 16 | 20 | 104 | 208 |

Table X (quaternary carry computation):

Function | Gi | Pi | C1 | C2 | C3 | C4 | 4 quaternary digits |
---|---|---|---|---|---|---|---|
T. count | 48 | 64 | 8 | 12 | 16 | 20 | 168 |
The transistor count of the carry computation is lower for quaternary adders than for binary ones. The increased cost of the Gi and Pi implementation is compensated by the reduced number of logical levels.
VIII-B Carry-Skip Adders
For an 8-bit CSA, the binary carry computation is composed of two 4-bit skip computations. For 4 bits, it means P1 to P4 functions, a 4-input And gate and a multiplexer. For a 4-digit CSA, the carry computation uses the same number of functions, the only difference being the implementation of Pi. The transistor counts are given in Table XI.

Type | Pi | Nand+inverter | Mux | 4-bit CS | 8-bit / 4-digit CS |
---|---|---|---|---|---|
B | 24 | 10 | 14 | 48 | 96 |
Q | 64 | 10 | 14 | - | 88 |
IX Comparing the different quaternary adders with binary adders
IX-A 1-digit quaternary adder versus 2-bit binary adder
Table XII summarizes the transistor count for the different quaternary adders:
- QB adder corresponds to the binary implementation with binary to quaternary interfaces (section V). The different values correspond to the different ways to implement a binary full adder. The middle value is probably the most significant.
- •
- •
- •
With one power supply, interfacing a 2-bit adder with quaternary-to-binary interfaces is the best implementation. With 3 power supplies, there is no significant difference with the best MUX-based quaternary implementation. In both cases, the different quaternary adders have two to three times the transistor count of a typical 2-bit binary adder.
IX-B 4-digit quaternary adders versus 8-bit binary adders
Table XIII and Table XIV summarize the transistor count for the different implementations of 4-digit quaternary adders, to be compared with an 8-bit binary adder. Within these tables,
- The first column is the adder type.
- The second column is the quaternary adder built from an 8-bit binary adder with 4-to-2 decoders and 2-to-4 encoders. The three values correspond to 1) an implementation without pass transistors, 2) a conventional implementation with pass transistors and 3) a debatable option where the Xor implementation could raise noise and switching issues. The second value is the most trustworthy one.
- The third column, in Table XIII, corresponds to the straightforward implementation according to the quaternary functions using 1 power supply.
- The fourth column, in Table XIV, corresponds to quaternary adders (3 power supplies) using Muxes.
- The fifth column corresponds to implementations with Muxes and customized successor and predecessor circuits.
- The last column presents the transistor count for the binary implementation. While this implementation only uses one power supply, it is included in Table XIV for comparison.
Table XIII (one power supply):

Adder | QB adders | [1] adder | [2] adder | [3] adder | 8-bit adder |
---|---|---|---|---|---|
CPA | 536/340/260 | 444 | - | 592/520 | 288/144/64 |
CLA | 784/588/508 | 612 | - | 760/688 | 496/352/272 |
CSA | 632/436/356 | 532 | - | 680/608 | 384/240/160 |

Table XIV (three power supplies):

Adder | QB adders | [1] adder | [2] adder | [3] adder | 8-bit adder |
---|---|---|---|---|---|
CPA | 464/268/188 | - | 616 | 400/328 | 288/144/64 |
CLA | 672/476/396 | - | 784 | 568/496 | 496/352/272 |
CSA | 560/364/284 | - | 704 | 488/416 | 384/240/160 |
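
The CPA rows of these tables are four times the 1-digit counts derived in the previous sections. The short sketch below recomputes them; the dictionary layout and labels are illustrative, and the per-digit values are the ones quoted in sections V to VII (the single-supply QB values being the three-supply ones plus 18 T).

```python
# 4-digit carry-propagate adders: 4 x the 1-digit transistor counts.

DIGITS = 4

PER_DIGIT_T = {
    "QB, 1 supply": (134, 85, 65),
    "QB, 3 supplies": (116, 67, 47),
    "[1], 1 supply": (111,),
    "[2], 3 supplies": (154,),
    "[3], 1 supply": (148, 130),
    "[3], 3 supplies": (100, 82),
    "binary, 2 bits per digit": (72, 36, 16),
}

CPA_T = {
    name: tuple(DIGITS * count for count in counts)
    for name, counts in PER_DIGIT_T.items()
}

if __name__ == "__main__":
    for name, counts in CPA_T.items():
        print(f"{name}: " + "/".join(str(c) for c in counts))
```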
Some significant results can be derived from Table XIII and Table XIV.
- With only one power supply, the direct interfacing of a binary adder with 4-to-2 decoders and 2-to-4 encoders is the best implementation, with the smallest transistor count.
- With three power supplies, only the implementation proposed in [3] can compete with the interfacing of binary adders. Note that the transistor count for this implementation is optimistic, as it implies that the minimal number of NQI, IQI and PQI inverters can be used without fan-out and connection issues. All the other implementations are outperformed by the direct interfacing of binary adders.
- Obviously, the best quaternary adder is outperformed by the binary adder computing the same amount of information. This binary adder is included in the best quaternary adder, while the interfacing decoder and encoder circuits are a significant overhead.
Quaternary adders are specific combinational circuits. They have some drawbacks. Either they use three power supplies instead of one for binary circuits, or they exhibit static power dissipation and degraded switching times when using only one power supply. However, the main point is that the best implementation of quaternary adders consists in interfacing binary adders with 4 to 2 decoder and 2 to 4 encoder circuits. It means that there is no advantage to try to directly implement quaternary combinational functions. To summarize, the best quaternary adder with N digits is the corresponding 2N binary adder with a significant overhead: decoder and encoder circuits.
X Concluding remarks
Most presented implementations of ternary or quaternary circuits claim advantages of multiple-valued circuits. The following quote summarizes the arguments that may be found in most MVL papers: “MVL circuits have potential advantages. Using MVL circuits reduces the complexity of interconnection via reducing the number of wires since each wire carries more than one digit of data. Power consumption and area of the MVL circuits are generally less than the corresponding binary circuits due to the reduction in number of active elements” [8].
How do our results fit with these claims? It is obvious that an N-digit quaternary adder has fewer input and output digits than a 2N-bit binary adder. But we have shown that the best N-digit quaternary adder includes the corresponding 2N-bit binary adder with the overhead of input decoder and output encoder circuits. According to Table XIII and Table XIV, the best 4-digit quaternary adder has more than 2.5x the transistor count of 8-bit binary adders. These transistors must be interconnected: the quaternary adders have far more connections than the binary adders as soon as the internal connections are considered. As a matter of fact, is there an “interconnection wall” in digital circuits, like the well-known “power wall” and “memory wall”? The answer is no, even if there can be interconnection issues in circuits such as FPGAs. While the most recent CMOS technology nodes are more and more costly, they have more and more interconnection layers. Twenty years ago, the 180 nm node had 6 metal layers. Today, the number of metal layers in nano-CMOS technologies usually ranges from 8 to 15, with a trade-off between integration and cost.
It is difficult to believe that x2.5 more transistors could lead to a reduction of chip area and power dissipation. More transistors mean more chip area and more power dissipation. It turns out that the assumptions of the quote are false, at least for using MVL techniques for combinational circuits such as adders, multipliers, etc.
MVL circuits are confined to a small niche [8]. To the best of my knowledge, there are today only two significant applications of MVL circuits:
- Reducing the number of interconnects with multiple levels is used in amplitude modulation: for instance, PAM-4 coding [9], which uses 4 levels to code 2 bits, is adopted for high-speed data transmission (IEEE 802.3bs). PAM-8 and PAM-16 have also been defined.
- 4-valued (MLC) flash memories store two bits per cell. 8-valued (TLC) memories store 3 bits per cell. However, these M-valued circuits are used for higher density, not for higher speeds.
Trying to design MVL combinational circuits to compete with binary ones looks like a dead-end.
References
- [1] S.A. Ebrahimi, M.R. Reshadinezhad, A. Bohlooli, M. Shahsavari, “Efficient CNTFET-based design of quaternary logic gates and arithmetic circuits”, Microelectronics Journal, pp. 156-166, January 2016.
- [2] M.H. Moaiyeri, K. Navi, O. Hashemipour, “Design and Evaluation of CNFET-Based Quaternary Circuits”, Circuits, Systems and Signal Processing 31, pp. 1631-1652, 2012, DOI 10.1007/s00034-012-9413-2
- [3] E. Roosta and S.A. Hosseiny, “A Novel Multiplexer-Based Quaternary Full Adder in Nanoelectronics”, Circuits, Systems and Signal Processing, https://doi.org/10.1007/s00034-019-01039-8
- [4] G. Hills, C. Lau, A. Wright et al., “Modern microprocessor built from complementary carbon nanotube transistors”, Nature 572, pp. 595-602, 2019, doi:10.1038/s41586-019-1493-8
- [5] K.K. Nehru, T. Nagarjuna and G. Vijay, “Comparative Analysis of CNTFET and CMOS Logic Based Arithmetic Logic Unit”, Journal of Nano and Electronic Physics 9 (4), January 2017.
- [6] vlsitechnology, http://www.vlsitechnology.org/html/cells/vxlib013/xor2.html
- [7] W. Haixia, Z. Shunan, S. Zhentao, Q. Xiaonan and C. Yueyang, “Design of low-power quaternary flip-flop based on dynamic source-coupled logic”, in Proceedings of 2011 International Conference on Electronics, Communications and Control (ICECC), 2011, pp. 826-828.
- [8] D. Etiemble, “Why M-Valued Circuits are restricted to a Small Niche”, Journal of Multiple Valued Logic and Soft Computing, Vol. 9, No. 1, 2003.
- [9] Intel, “PAM4 Signaling Fundamentals”, https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/an/an835.pdf