Identifying Vulnerabilities in Smart Contracts using Interval Analysis
Abstract
This paper serves as a progress report on our research, specifically focusing on utilizing interval analysis, an existing static analysis method, for detecting vulnerabilities in smart contracts. We present a selection of motivating examples featuring vulnerable smart contracts and share the results from our experiments conducted with various existing detection tools. Our findings reveal that these tools were unable to detect the vulnerabilities in our examples. To enhance detection capabilities, we implement interval analysis on top of Slither [4], an existing detection tool, and demonstrate its effectiveness in identifying certain vulnerabilities that other tools fail to detect.
1 Introduction
The term “smart contract” was originally used to describe automated legal contracts whose content cannot be negotiated or changed. Nowadays, the term most commonly refers to programs that are executed by special nodes in a decentralised network or blockchain. Indeed, blockchain technology captures the initial meaning of the term: contracts are encoded as immutable pieces of code, and the terms of the contract are predetermined and automatically enforced by the contract itself.
This immutability property also demands more effort from contract developers: they have to be very careful about what gets deployed because that code (1) is public, so anyone can read it, and (2) cannot be changed or updated like an ordinary program. Ethereum (Vitalik Buterin, A Next-Generation Smart Contract and Decentralized Application Platform, 2014, URL: https://ethereum.org/en/whitepaper/) is a very popular blockchain platform which has been affected by the most significant attacks based on vulnerabilities in the deployed code. For instance, “The DAO attack” (David Siegel, 2016: Understanding The DAO Attack. URL: https://www.coindesk.com/learn/2016/06/25/understanding-the-dao-attack/) was based on the fact that a smart contract could be interrupted in the middle of its execution and then called again. This is known as the reentrancy vulnerability. An attacker noticed that, in a withdrawal function of the smart contract, the transfer of digital assets was performed before updating the balance of a contract party (i.e., before decrementing the balance by the withdrawn amount). The attacker first deposited cryptocurrency into the smart contract. Then, by creating a scenario where the withdrawal function was called again just before the attacker's balance was updated, the attacker managed to keep draining funds for as long as the contract's balance exceeded the amount withdrawn.
Such mistakes are unfortunate, and researchers and practitioners have started to propose methods and tools for detecting them. For example, Slither [4] is an easy-to-use static analysis tool for smart contracts written in Solidity (Solidity, version 0.8.20: https://docs.soliditylang.org/en/v0.8.20/control-structures.html); Mythril (Mythril docs: https://mythril-classic.readthedocs.io/en/develop/) is a security analysis tool for EVM bytecode based on symbolic execution; Solhint (Solhint official website: https://protofire.github.io/solhint/) is a linter for Solidity code. The list of tools is long and has been explored in various papers (e.g., [6, 8, 7, 2, 5]). These tools are indeed very useful, but at the same time they are not perfect, and they can fail to detect problematic situations in smart contract code.
In this paper we provide several examples of smart contracts which contain vulnerabilities, and we find that vulnerability detection tools are not as precise as we expect: they fail to detect the vulnerabilities in our examples. We attempt to enhance one of them (Slither) with an existing static analysis method called interval analysis. This method allows us to better approximate the interval of values for each program variable. Based on the experiments that we performed, interval analysis proves to be very useful in detecting problematic situations in smart contracts. For example, integer division in Solidity ignores the remainder. In a situation where an amount of cryptocurrency must be divided and transferred to a number of recipients, a division where the remainder is ignored could lead to funds that remain locked in the smart contract. Another example is related to uninitialised variables: such variables are initialised with default values and it may be the case that the default value is not suitable for the purpose of that variable.
By keeping track of all the possible values for each program variable, interval analysis allows us to signal such situations in smart contracts.
Summary of contributions.
1. We provide several examples of vulnerable smart contracts, in which the vulnerabilities prove to be challenging to detect using state-of-the-art detection tools.
2. We implement an existing analysis technique called interval analysis on top of Slither.
3. We evaluate our implementation.
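Paper organisation. Section 2 presents the vulnerabilities that we selected, Section 3 reports how existing tools behave on them, Section 4 describes interval analysis and our implementation on top of Slither, and Section 5 concludes and outlines future work.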
2 Vulnerabilities in Smart Contracts
This section contains several examples of smart contract vulnerabilities, illustrated in Solidity. They were selected from a larger taxonomy [7]. The whole classification includes 55 vulnerabilities split among 10 categories. Both the literature and existing community taxonomies were taken into account when selecting these defects. We selected these vulnerabilities because state-of-the-art tools are not able to detect most of them, and because they could be detected using interval analysis.
2.1 Tautologies or Contradictions in assert or require Statements
The Solidity statements assert and require are typically used to validate boolean conditions. According to the Solidity documentation (https://docs.soliditylang.org/en/v0.8.20/control-structures.html), assert is meant for checking internal errors, while require should be used to test conditions that cannot be determined until runtime. Both statements throw exceptions and revert the corresponding transactions. In their intended use, the condition in an assert should never be false, as a failure signals a contract-level error, while the condition in a require can be false, as a failure signals an input error. Whatever level of error a statement targets, it is an issue if its condition is a tautology or a contradiction. A tautology makes the statement useless, and a contradiction makes the transaction impossible to complete, as illustrated by the following code:

    function notGonnaExecute(uint parameter) external pure returns(uint)
    {
        require(parameter<0); // uint cannot be < 0
        return parameter;
    }

    function uselessAssertUint(uint parameter) external pure returns(uint)
    {
        require(parameter>=0); // uint is always >= 0
        return parameter;
    }
2.2 Division by Zero
This is a classic arithmetic issue that is common to most programming languages. The Solidity compiler rejects a direct division by zero. However, the compiler cannot detect situations in which the denominator could evaluate to zero at runtime. The following code snippet contains an example. The length of the recipients array is not checked before computing the amount that should be sent to each recipient (see the commented line):

    function split(address[] calldata recipients) external payable
    {
        require(msg.value > 0,"Please provide currency to be split among recipients");
        uint amount = msg.value / recipients.length; // problem here if length is 0
        for(uint index = 0; index < recipients.length; index++)
        {
            (bool success,) = payable(recipients[index]).call{value: amount}("");
            require(success,"Could not send ether to recipient");
        }
    }
2.3 Integer Division Remainder
This is another arithmetic issue that is common to many programming languages. Solidity performs integer division, which means that the result of the division is truncated. Ignoring the remainder can lead to logic errors. The snippet below contains an example: if the provided amount is not exactly divisible by the number of recipients, then the remainder could stay locked in the contract.

    function split(address[] calldata recipients) external payable
    {
        require(recipients.length > 0,"Empty recipients list");
        uint amountPerRecipient = msg.value / recipients.length; // remainder is discarded
        require(amountPerRecipient > 0,"Amount must be positive");
        for(uint index = 0; index < recipients.length; index++)
        {
            payable(recipients[index]).transfer(amountPerRecipient);
        }
    }
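To make the locked remainder concrete, the following Python sketch mimics the arithmetic of the split function using truncating integer division. The helper name split_amounts and the sample values (10 wei, 3 recipients) are illustrative assumptions and do not come from a deployed contract.

```python
from typing import Tuple


def split_amounts(value_wei: int, recipient_count: int) -> Tuple[int, int]:
    """Mimic the truncating division used in the split example.

    Returns (amount sent to each recipient, wei left behind in the contract).
    """
    if recipient_count <= 0:
        raise ValueError("Empty recipients list")
    # Floor division on non-negative integers truncates like Solidity's uint division.
    amount_per_recipient = value_wei // recipient_count
    locked = value_wei - amount_per_recipient * recipient_count
    return amount_per_recipient, locked


per_recipient, locked = split_amounts(10, 3)
print(per_recipient, locked)
```

With 10 wei and 3 recipients, each recipient receives 3 wei and 1 wei stays in the contract; the locked amount is always smaller than the number of recipients.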
2.4 Uninitialised Variable
Uninitialised variables could lead to logical errors or exceptions. If a variable is not initialised, there is a good chance that the default value assigned to it (according to its type) is not suitable for its purpose. The following code contains an access modifier which relies on the owner state variable. The variable is private, and thus it cannot be accessed or assigned from outside the contract. Also, there is no explicit initialisation of owner within a constructor. The variable is therefore stuck at its default value (the zero address), and all the functions marked with the onlyOwner modifier cannot be executed.

    address private owner;

    modifier onlyOwner() {
        require(msg.sender == owner, "Only the owner of the contract has access");
        _;
    }
2.5 User Input Validation
Parameter validation, or “sanitisation”, is a process that must be implemented at the beginning of every method, so that the method always executes as expected. End users should not be trusted to always provide valid parameters. If validation is missing and the end user is unaware or, worse, malicious, invalid input could cause critical errors that produce unexpected results or halt contract execution altogether. The following example contains a getter method for an internal array. The user can provide an index that is not validated, and the access can therefore go out of bounds.

    uint256[] private _array = [10, 20, 30, 40, 50];

    function getArrayElement(uint256 index) external view returns (uint256)
    {
        return _array[index];
    }
2.6 Unmatched Type
In Solidity, enums are stored as unsigned integers. Thus, values of type uint can be explicitly converted to an enum type. Situations like these can become tricky since the value domain of an enum is likely to be much smaller than the value domain of unsigned integers. If a value greater than the largest member of the enum is converted to the enum type, then the transaction is reverted. While reverting the transaction is considered safe, such situations signal faulty logic in the contract code and are best avoided.

    contract UnmatchedType {
        enum Options { Candidate1, Candidate2, Candidate3 }
        mapping(address => Options) private _votes;
        mapping(Options => uint) private _votesCount;

        function vote(uint option) external {
            _votes[msg.sender] = Options(option);
            _votesCount[Options(option)]++;
        }

        function getStatisticsForOption(uint option) external view returns(uint) {
            return _votesCount[Options(option)];
        }
    }
3 Detecting Vulnerabilities Using Dedicated Analysis Instruments
This section briefly presents the results of our experiments. We ran several smart contract analysis tools on the examples presented in Section 2 to check how they behave. The tools that we selected are presented below. It is worth noting that we picked tools which implement different techniques (e.g., static analysis, symbolic execution, linting). For a tool to be eligible for our study, it has to be open source, actively maintained and compatible with the latest version of Solidity.
Slither is a static analysis tool written in Python. It provides vulnerability detection and code optimization advice, and it features many detectors that target different issues. Its analysis runtime is very low compared to the other tools. It analyses Solidity code by transforming the abstract syntax tree produced by the Solidity compiler into an intermediate representation called SlithIR. Being an open source project, it allows anyone to contribute and improve it, and it is the foundation for our implementation (discussed in Section 4). Slither was able to detect uninitialized variables as well as trivial tautologies and contradictions in our examples.
Solhint is an open source linter for Solidity code. It is able to detect possible vulnerabilities, optimization opportunities and deviations from style conventions. The tool also features a customisable set of detection rules, along with predefined configurations. Users can define their own configurations and decide which issues they want to target. Unfortunately, Solhint was not able to detect any of the issues in our examples.
Remix (Remix docs: https://remix-ide.readthedocs.io/en/latest/) also features a static analysis plugin. We were unable to find any information about the analysis process performed by this tool. Moreover, it was unable to detect any problems in our examples.
Mythril is a tool that leverages symbolic execution to simulate multiple runs of a contract’s methods. It has a fairly long runtime compared to the others. We even encountered executions that took more than a few hours. Mythril was unable to detect any of the issues presented above.
In Table 1, we present a summary of the results that we obtained. The results indicate that nearly all tools fail to detect the vulnerabilities in our examples. This does not mean that these tools are not useful or very bad at signaling issues in smart contract code. The way we interpret these results is that these tools need to be enhanced with more powerful techniques that could increase their detection capabilities.
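Table 1: Detection results of the selected tools on the examples from Section 2 (✓ detected, ✗ not detected).

Examples | Slither | Solhint | Remix | Mythril |
---|---|---|---|---|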
Tautologies/Contradictions | ✓ | ✗ | ✗ | ✗ |
Division by zero | ✗ | ✗ | ✗ | ✗ |
Integer division | ✗ | ✗ | ✗ | ✗ |
Uninitialised variable | ✓ | ✗ | ✗ | ✗ |
User input validation | ✗ | ✗ | ✗ | ✗ |
Unmatched type | ✗ | ✗ | ✗ | ✗ |
4 Interval Analysis for Vulnerability Detection
4.1 Interval Analysis
Interval analysis [9] is a static analysis technique that approximates, for every instruction in a program, the interval of values that each variable may take. The technique is not limited to numeric ranges: it can also approximate properties derived from the value of a variable. For instance, instead of working with integer intervals, an analysis can track the parity of variables and work only with a two-valued domain (even, odd).
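As an illustration of the two abstractions mentioned above, the following Python sketch defines a minimal interval domain and a parity abstraction derived from it. The class and function names (Interval, join, add, parity) are assumptions made for this sketch and are not the data types of our implementation; arithmetic overflow is deliberately ignored.

```python
from dataclasses import dataclass

UINT256_MAX = 2**256 - 1  # largest value of Solidity's uint (uint256)


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [low, high]; hypothetical helper type."""
    low: int
    high: int

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError("empty interval")

    def contains(self, value: int) -> bool:
        return self.low <= value <= self.high

    def join(self, other: "Interval") -> "Interval":
        """Smallest interval that covers both operands."""
        return Interval(min(self.low, other.low), max(self.high, other.high))

    def add(self, other: "Interval") -> "Interval":
        """Abstract addition; overflow is ignored in this sketch."""
        return Interval(self.low + other.low, self.high + other.high)


def parity(interval: Interval) -> set:
    """Abstract an interval into the parity domain {"even", "odd"}."""
    if interval.low == interval.high:
        return {"even" if interval.low % 2 == 0 else "odd"}
    # Two or more consecutive integers always include both parities.
    return {"even", "odd"}


unknown_uint = Interval(0, UINT256_MAX)  # nothing is known about a uint value
merged = Interval(0, 2).join(Interval(5, 7))
```

The join operation is what merges information arriving from different branches of the control flow graph: the result over-approximates both inputs, which keeps the analysis sound at the cost of precision.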
We present interval analysis via the Unmatched Type example from Section 2.6, and we show how it can help us detect a problem in the vote function:

    function vote(uint option) external {
        _votes[msg.sender] = Options(option); // Statement 1
        _votesCount[Options(option)]++;       // Statement 2
    }
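Table 2: Intervals computed for the vote function; each row gives the intervals before the statement in the first column executes.

Statements | option | _votes[msg.sender] |
---|---|---|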
1 | [0, max] | [0, max] |
2 | [0, max] | [0, 2] |
End | [0, max] | [0, 2] |
The function registers the vote of a user and increases the total vote count for the chosen option. The problem is at Statement 1: the input is of type uint and it could easily be outside the value range {0, 1, 2} of the enum.
Interval analysis provides an approximation of the interval of values for every program variable at each program location. Table 2 shows how these intervals are computed for our example. Each row of the table presents the intervals for the program variables (displayed in columns) before the execution of the statement in the first column. For example, before the execution of Statement 1, we do not have any information about the option variable, so its range of values corresponds to the value domain of uint, written [0, max] with max = 2^256 - 1. For _votes[msg.sender], the interval changes before Statement 2 to [0, 2], the only possible range for Options, provided that execution continues normally (otherwise, the transaction is reverted). Interval analysis performs this calculation using the Worklist Algorithm, which traverses the program control flow graph and updates the intervals of the variables until a fixpoint is reached. This algorithm is discussed in Section 4.2.
Recall that the problem we are trying to detect using interval analysis is a mismatch between the domain of the variable being assigned and the domain of the value assigned to it. Since interval analysis computes the intervals for option and _votes[msg.sender], inspecting the difference between the two intervals is sufficient to reveal the problem. A require statement that checks the value of the option parameter upfront would solve the problem, and our detection technique would then no longer signal an issue.
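The comparison of intervals described above can be sketched in a few lines of Python. Intervals are modelled as (low, high) tuples, and the helper names fits and refine_upper are hypothetical; the guard require(option <= 2) is an assumed fix and is not part of the original contract.

```python
UINT256_MAX = 2**256 - 1
ENUM_OPTIONS = (0, 2)  # Options has three members, stored as 0, 1 and 2


def fits(source, target):
    """True if every value of `source` also lies in `target`."""
    return target[0] <= source[0] and source[1] <= target[1]


def refine_upper(interval, bound):
    """Narrow an interval after a passing `require(x <= bound)`."""
    return (interval[0], min(interval[1], bound))


# Unconstrained uint parameter, as before Statement 1 in Table 2.
option = (0, UINT256_MAX)
unchecked_is_safe = fits(option, ENUM_OPTIONS)

# Assumed fix: a require(option <= 2) placed before the conversion.
guarded_option = refine_upper(option, 2)
checked_is_safe = fits(guarded_option, ENUM_OPTIONS)

print(unchecked_is_safe, checked_is_safe)
```

Without the guard, the interval of option is not contained in the interval of the enum, so the conversion is flagged; with the guard, the refined interval is [0, 2] and no issue is reported.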
4.2 An Implementation of Interval Analysis on Top of Slither
We built our implementation using Python modules provided by Slither. These are the same modules that Slither uses internally for its own detectors. During execution, Slither fills some of its internal data structures with useful information, such as the contract CFG (control flow graph), an intermediate SSA (static single assignment) representation of the code, and information about each variable (e.g., type, scope and name). We use the information in these data structures to implement interval analysis.
The Worklist Algorithm shown in Figure 1 works by processing every edge in the contract CFG. These edges are added to a list (the “worklist”). The algorithm is iterative: it processes elements of the list until the list is empty. Whenever processing an edge adds new information to the current state, the edges affected by that change are added back to the worklist. The algorithm stops when no more new information can be discovered.
We implemented a modular Worklist algorithm. Essential information such as the extreme labels (the program nodes where the analysis begins), the order function (a function that receives two elements of the same domain and determines the greater one) and the flow function (a function determining the edges in the flow graph, or the reverse of those edges, depending on the type of analysis) is provided as parameters to the Worklist algorithm. This allows us to perform multiple types of analysis using the same base implementation. Our implementation leverages the CFG provided by Slither to split the code of a function into multiple parts. Each node is then split even further into SlithIR SSA [3] lines that are analyzed individually. Along with basic types such as uint and bool, our implementation is able to model complex types such as arrays, mappings and structs.
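The parameterised design described above can be sketched as follows. This is a textbook-style forward worklist solver written for illustration, not the code of our implementation and not part of Slither's API: the function solve, its parameter names, the toy interval domain and the four-node CFG are all assumptions. The order function is represented here by a less-or-equal test (leq), and widening is omitted, so loops whose intervals keep growing would not terminate in this sketch.

```python
from collections import deque


def solve(edges, extremal_labels, extremal_value, bottom, join, leq, transfer):
    """Generic forward worklist solver (hypothetical sketch).

    edges: list of (source, target) CFG edges
    transfer: function (label, entry_state) -> state leaving that label
    Returns the state known on entry to each label.
    """
    labels = {label for edge in edges for label in edge} | set(extremal_labels)
    analysis = {label: bottom for label in labels}
    for label in extremal_labels:
        analysis[label] = extremal_value

    pending = deque(edges)
    while pending:
        source, target = pending.popleft()
        outgoing = transfer(source, analysis[source])
        if not leq(outgoing, analysis[target]):
            # New information: merge it and revisit the successors of target.
            analysis[target] = join(analysis[target], outgoing)
            pending.extend(edge for edge in edges if edge[0] == target)
    return analysis


# Toy domain for a single variable x: None means "unreachable",
# otherwise the state is an interval (low, high).
TOP = (float("-inf"), float("inf"))


def join(a, b):
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def leq(a, b):
    if a is None:
        return True
    if b is None:
        return False
    return b[0] <= a[0] and a[1] <= b[1]


def transfer(label, state):
    if state is None:
        return None
    if label == 1:          # node 1: x = 0
        return (0, 0)
    if label == 3:          # node 3: x = x + 5
        return (state[0] + 5, state[1] + 5)
    return state            # nodes 2 and 4 do not modify x


# CFG of: 1: x = 0;  2: if (...)  3: x = x + 5;  4: merge point
EDGES = [(1, 2), (2, 3), (2, 4), (3, 4)]
result = solve(EDGES, [1], TOP, None, join, leq, transfer)
print(result[4])  # interval of x on entry to node 4
```

On entry to node 4 the two branches are joined, so x is approximated by the interval (0, 5). Swapping the domain-specific functions (join, leq, transfer) for another domain reuses the same solver, which is the point of passing them as parameters.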
We defined our own data type to encapsulate information about a variable, such as its type, scope, name and, most importantly, its interval of values. The program state is represented as a dictionary having variable names as keys and objects of this type as values. Complex types are defined as recursive dictionaries: for example, the interval for a struct is modeled as a dictionary containing an interval for each of its fields, or further dictionaries if the structs are nested.
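A possible shape for such a program state is sketched below. The record VariableInfo, its field names, the helper lookup and the sample variables (counter, order) are hypothetical and chosen only to illustrate the recursive dictionaries; they do not reproduce the actual classes of our implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

UINT256_MAX = 2**256 - 1

# Either an interval (low, high) or, for structs, a dictionary that maps
# field names to further interval trees.
IntervalTree = Union[Tuple[int, int], Dict[str, "IntervalTree"]]


@dataclass
class VariableInfo:
    """Hypothetical record; the field names are assumptions."""
    name: str
    type_name: str
    scope: str
    interval: IntervalTree


def lookup(info: VariableInfo, path: Tuple[str, ...]):
    """Follow a chain of struct field names down to the stored interval."""
    node = info.interval
    for field_name in path:
        node = node[field_name]
    return node


# Program state: variable name -> information about that variable.
state: Dict[str, VariableInfo] = {
    "counter": VariableInfo("counter", "uint256", "state", (0, UINT256_MAX)),
    "order": VariableInfo(
        "order", "Order", "local",
        {"quantity": (1, 10), "buyer": {"age": (18, 120)}},
    ),
}

print(lookup(state["order"], ("buyer", "age")))
```

Nesting dictionaries mirrors the nesting of struct declarations, so a field access such as order.buyer.age maps directly to a path of keys.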
Current status.
Our implementation is now able to successfully analyze programs containing assignments and arithmetic expressions for both elementary and complex types. It takes into consideration state variables, function parameters, and local variables. We are now capable of detecting issues such as:
- Arithmetic issues including Division By Zero and Integer Division Remainder;
- Issues related to variables initialization;
- Issues related to parameter validation.
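As a sketch of how the arithmetic issues in the list above might be flagged once intervals are available, the following Python functions inspect the intervals of the operands of a division. The function names and the exact conditions are assumptions made for illustration; the checks are conservative, meaning that they report an issue whenever it cannot be ruled out.

```python
UINT256_MAX = 2**256 - 1


def may_divide_by_zero(denominator):
    """True if zero lies inside the interval of the denominator."""
    low, high = denominator
    return low <= 0 <= high


def may_leave_remainder(numerator, denominator):
    """True unless the division can be shown to be exact."""
    n_low, n_high = numerator
    d_low, d_high = denominator
    # Zero divided by anything, or anything divided by one, is exact.
    if (n_low, n_high) == (0, 0) or (d_low, d_high) == (1, 1):
        return False
    # Both operands are known exactly: decide by computing the remainder.
    if n_low == n_high and d_low == d_high and d_low != 0:
        return n_low % d_low != 0
    return True


# recipients.length without any check: the interval includes zero.
unchecked_length = (0, UINT256_MAX)
# After require(recipients.length > 0): zero is excluded.
checked_length = (1, UINT256_MAX)
# msg.value after require(msg.value > 0).
value = (1, UINT256_MAX)

print(may_divide_by_zero(unchecked_length))
print(may_divide_by_zero(checked_length))
print(may_leave_remainder(value, checked_length))
```

For the split examples from Section 2, the unchecked length is reported as a possible division by zero, the checked length is not, and the division is still reported as possibly leaving a remainder.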
5 Conclusions
In this paper we identified some vulnerabilities that are not handled by state-of-the-art tools for smart contract analysis. These vulnerabilities vary in severity, but whatever their impact, defects and potential errors should be identified as early as possible. We attempt to improve Slither with a more powerful technique called interval analysis. We explain why this technique is a good fit for detecting these issues and how it could detect them. We built a custom interval analysis on top of Slither, leveraging the information that Slither already provides about a contract: its attributes, methods, method parameters, program flow and more. Currently, our implementation detects vulnerabilities that other tools miss.
5.1 Future Work
We currently handle only a subset of the Solidity programming language, covering expressions over integers, booleans, arrays, structures, and mappings. More work is needed to handle addresses and operations over addresses, more complex loops and conditional statements, etc. Right now, our code can be executed on every smart contract written in Solidity, but it performs interval analysis only for the subset that we cover.
Interprocedural analysis would significantly improve the precision of our analysis. An example of a vulnerability that we are not yet able to detect is Short Address. This could be detected by monitoring the length attribute of the payload. Another example is Tautologies and Contradictions in Assert or Require Statements: it could be detected by approximating the result of the boolean expression and checking if the interval contains only one value: true for tautologies and false for contradictions.
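The idea of approximating a boolean expression can be sketched as follows. This is one possible approach, not a feature of our current implementation: the function names compare_lt, compare_ge and classify are hypothetical, intervals are (low, high) tuples, and the abstract result is the set of truth values that the comparison may take.

```python
UINT256_MAX = 2**256 - 1


def compare_lt(left, right):
    """Abstract `left < right`: the set of truth values it may produce."""
    outcomes = set()
    if left[0] < right[1]:
        outcomes.add(True)    # some pair of values satisfies left < right
    if left[1] >= right[0]:
        outcomes.add(False)   # some pair of values violates it
    return outcomes


def compare_ge(left, right):
    """Abstract `left >= right`, the negation of `left < right`."""
    return {not outcome for outcome in compare_lt(left, right)}


def classify(outcomes):
    if outcomes == {True}:
        return "tautology"
    if outcomes == {False}:
        return "contradiction"
    return "contingent"


parameter = (0, UINT256_MAX)  # unconstrained uint
zero = (0, 0)

print(classify(compare_lt(parameter, zero)))    # require(parameter < 0)
print(classify(compare_ge(parameter, zero)))    # require(parameter >= 0)
print(classify(compare_lt(parameter, (5, 5))))  # require(parameter < 5)
```

Applied to the examples from Section 2.1, the condition parameter < 0 is classified as a contradiction and parameter >= 0 as a tautology, while a condition such as parameter < 5 depends on the input and is left alone.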
Being able to handle more complex conditional statements and loops would also be of great help in obtaining a more accurate monitoring of the program state by interpreting the semantics of boolean expressions.
Once we identify multiple possible states based on conditional branches, we can leverage unifying techniques such as Trace Partitioning.
Additionally, monitoring implicit state variables that are contract-level or function-level, like balance or msg.sender, would be beneficial in identifying balance-related issues and user interaction problems.
References
- [2] N. Atzei, M. Bartoletti & T. Cimoli (2017): A Survey of Attacks on Ethereum Smart Contracts (SoK). In M. Maffei & M. Ryan, editors: Principles of Security and Trust, Springer, Berlin, Heidelberg, pp. 164–186, 10.1007/978-3-662-54455-6_8.
- [3] R. Cytron, A. Lowry & F.K. Zadeck (1986): Code motion of control structures in high-level languages. In: Proceedings of the 13th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pp. 70–85, 10.1145/512644.512651.
- [4] J. Feist, G. Grieco & A. Groce (2019): Slither: a static analysis framework for smart contracts. In: 2019 IEEE/ACM 2nd Intl Workshop on Emerging Trends in Software Engineering for Blockchain, IEEE, pp. 8–15, 10.1109/WETSEB.2019.00008.
- [5] I. Grishchenko, M. Maffei & C. Schneidewind (2018): A Semantic Framework for the Security Analysis of Ethereum Smart Contracts. In L. Bauer & R. Küsters, editors: Principles of Security and Trust, Springer, pp. 243–269, 10.1007/978-3-319-89722-6_10.
- [6] A. Mense & M. Flatscher (2018): Security Vulnerabilities in Ethereum Smart Contracts. In: Proceedings of the 20th International Conference on Information Integration and Web-Based Applications and Services, iiWAS2018, Association for Computing Machinery, New York, NY, USA, pp. 375–380, 10.1145/3282373.3282419.
- [7] H. Rameder, M. di Angelo & G. Salzer (2022): Review of Automated Vulnerability Analysis of Smart Contracts on Ethereum. Frontiers in Blockchain 5, 10.3389/fbloc.2022.814977.
- [8] P. Tolmach, Y. Li, S.-W. Lin, Y. Liu & Z. Li (2021): A Survey of Smart Contract Formal Specification and Verification. ACM Comput. Surv. 54(7), 10.1145/3464421.
- [9] Y. Wang, Y. Gong, J. Chen, Q. Xiao & Z. Yang (2008): An Application of Interval Analysis in Software Static Analysis. In: 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, 2, pp. 367–372, 10.1109/EUC.2008.60.