Formatting Instructions for ICLR 2023
Conference Submissions
Abstract
The abstract paragraph should be indented 1/2 inch (3 picas) on both left and right-hand margins. Use 10 point type, with a vertical spacing of 11 points. The word Abstract must be centered, in small caps, and in point size 12. Two line spaces precede the abstract. The abstract must be limited to one paragraph.
1 Introduction
Deep learning has had tremendous success in a wide range of domains, such as vision (he2016deep), language (brown2020language), and playing games at superhuman levels (mnih2015human; silver2016mastering; vinyals2019grandmaster). Yet despite these accomplishments, these systems remain limited in their formal and mathematical reasoning abilities (saxton2018analysing; cobbe2021training; hendrycksmath2021). Although there have been impressive recent gains (lewkowycz2022solving), the models still struggle with harder problems.
Recent work suggests that neural networks, like humans, benefit from relying on a chain of reasoning steps rather than attempting to produce the final output as a direct mapping from the problem prompt (recchia2021teaching; nye2021show; hendrycksmath2021; cobbe2021training; lewkowycz2022solving). These works rely entirely on naturalistic data and manipulations: problems and their step-wise solutions are taken as they are found in existing sources, or human annotators are asked to produce a sequence of solution steps using numbers interspersed with natural language. However, while naturalistic sentences are certainly how we often communicate our solutions to each other informally, we argue that formal and mathematical reasoning depends on identifying and exploiting the set of abstract relationships that underlies the details of the problem at hand. Even in settings where the focus is on the step-wise manipulation of quantities to obtain valid practical results, a set of abstract relationships underlies the sequence of operations.
We build on this intuition by exploring the possibility that, if a problem-solver can formulate the problem under consideration at an abstract level, this will be conducive to finding the correct sequence of more specific arithmetic operations. However, to our knowledge, no existing math dataset both uses natural language and isolates key reasoning components such as entities and their relations; that is, there is no way to train a model to convert natural language inputs into these core elements. We address this gap by proposing a new dataset, \gsmr, which extends GSM8K (cobbe2021training), a dataset of grade-school level math word problems, with human annotations that highlight the relational abstractions that are central to mathematical reasoning. We also introduce a new synthetic task, the unit conversion (UC) task, in which the abstract relational problem is reduced to its essence, enabling controlled analyses without the complications that arise from naturalistic datasets.
At their core, both tasks involve reasoning about how different quantities relate to each other, and formulating appropriate arithmetic equations to perform the corresponding numerical computations. We can decompose each step of the solution into abstract relational reasoning and arithmetic expressions, which can then be used to recompose the solution sequence in different forms.
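To make the decomposition concrete, the following is a minimal illustrative sketch. The word problem and the variable names are hypothetical and are not drawn from GSM8K, \gsmr, or the UC task; the layout assumes the amsmath package for the align* environment. Each solution step is split into an abstract relation (left) and the arithmetic expression that instantiates it (right).

```latex
% Hypothetical problem (illustration only, not from any dataset):
% "A crate holds 4 boxes, and each box holds 6 apples.
%  How many apples are in 3 crates?"
% Requires \usepackage{amsmath} for align*.
\begin{align*}
  % Step 1: relational abstraction, then its arithmetic instance
  \text{boxes} &= \text{crates} \times \text{boxes per crate}
    & 3 \times 4 &= 12 \\
  % Step 2: relational abstraction, then its arithmetic instance
  \text{apples} &= \text{boxes} \times \text{apples per box}
    & 12 \times 6 &= 72
\end{align*}
```

The same two columns could be recombined in other orders, for example all relations first and all arithmetic afterwards, which is the kind of recomposition the paragraph above describes.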
We summarize our main contributions as follows:
- We decompose the problem solving process into identifying the relevant abstract relationships and performing the corresponding arithmetic manipulations.
- We present a new dataset called \gsmr that adds relational abstraction annotations to the original GSM8K dataset (cobbe2021training) (to be released with the paper).
- We introduce a new synthetic task, the unit conversion (UC) task, that brings out the importance of engaging with the relational abstractions, even in smaller transformer models.
- We find that teaching models to identify the relevant abstract relationships on training problems can lead to substantial performance gains at test time, and we identify several factors affecting this outcome.
- We find that identifying the crucial abstract relationships remains a challenge, and that providing the relational abstraction at test time can produce drastic gains.
Taken together, we believe these findings highlight the importance of identifying the relevant abstract relations to enable correct formal and mathematical reasoning. In the discussion, we consider next steps that may allow the development of artificial systems that capture this ability.
2 Submission of conference papers to ICLR 2023
ICLR requires electronic submissions, processed by https://openreview.net/. See ICLR’s website for more instructions.
If your paper is ultimately accepted, the statement \iclrfinalcopy should be inserted to adjust the format to the camera ready requirements.
The format for the submissions is a variant of the NeurIPS format. Please read carefully the instructions below, and follow them faithfully.
2.1 Style
Papers to be submitted to ICLR 2023 must be prepared according to the instructions presented here.
Authors are required to use the ICLR LaTeX style files obtainable at the ICLR website. Please make sure you use the current files and not previous versions. Tweaking the style files may be grounds for rejection.
2.2 Retrieval of style files
The style files for ICLR and other conference information are available online at:
The file iclr2023_conference.pdf contains these instructions and illustrates the various formatting requirements your ICLR paper must satisfy. Submissions must be made using LaTeX and the style files iclr2023_conference.sty and iclr2023_conference.bst (to be used with LaTeX2e). The file iclr2023_conference.tex may be used as a “shell” for writing your paper. All you have to do is replace the author, title, abstract, and text of the paper with your own.
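As a rough sketch of what such a shell might look like, the skeleton below loads the style file, leaves \iclrfinalcopy commented out for the initial submission, and points the bibliography at the supplied .bst file. The title, author, and bibliography file name (mypaper.bib) are placeholders assumed for illustration; the shell distributed with the style files is the authoritative version.

```latex
\documentclass{article} % LaTeX2e
\usepackage{iclr2023_conference,times} % ICLR style file plus Times typeface

% Placeholder metadata: replace with your own.
\title{Your Paper Title}
\author{Anonymous authors}

% Uncomment ONLY for the camera-ready version of an accepted paper.
% \iclrfinalcopy

\begin{document}
\maketitle

\begin{abstract}
One-paragraph abstract goes here.
\end{abstract}

\section{Introduction}
Body text goes here.

% Assumes a BibTeX database named mypaper.bib (hypothetical file name).
\bibliography{mypaper}
\bibliographystyle{iclr2023_conference}

\end{document}
```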
3 General formatting instructions
The text must be confined within a rectangle 5.5 inches (33 picas) wide and 9 inches (54 picas) long. The left margin is 1.5 inch (9 picas). Use 10 point type with a vertical spacing of 11 points. Times New Roman is the preferred typeface throughout. Paragraphs are separated by 1/2 line space, with no indentation.
Paper title is 17 point, in small caps and left-aligned. All pages should start at 1 inch (6 picas) from the top of the page.
Authors’ names are set in boldface, and each name is placed above its corresponding address. The lead author’s name is to be listed first, and the co-authors’ names are set to follow. Authors sharing the same address can be on the same line.
Please pay special attention to the instructions in section 5 regarding figures, tables, acknowledgments, and references.
There will be a strict upper limit of 9 pages for the main text of the initial submission, with unlimited additional pages for citations.
4 Headings: first level
First level headings are in small caps, flush left and in point size 12. One line space before the first level heading and 1/2 line space after the first level heading.
4.1 Headings: second level
Second level headings are in small caps, flush left and in point size 10. One line space before the second level heading and 1/2 line space after the second level heading.
4.1.1 Headings: third level
Third level headings are in small caps, flush left and in point size 10. One line space before the third level heading and 1/2 line space after the third level heading.
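A minimal sketch of the three heading levels, using the standard LaTeX sectioning commands. It assumes the ICLR style file is loaded, so that font size, small caps, and spacing are applied by the style rather than set by hand.

```latex
% Standard sectioning commands; the loaded style file is assumed
% to handle size, small caps, and spacing for each level.
\section{Headings: first level}
\subsection{Headings: second level}
\subsubsection{Headings: third level}
```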
5 Citations, figures, tables, references
These instructions apply to everyone, regardless of the formatter being used.
5.1 Citations within the text
Citations within the text should be based on the natbib package and include the authors’ last names and year (with the “et al.” construct for more than two authors). When the authors or the publication are included in the sentence, the citation should not be in parentheses; use \citet{} (as in “See Hinton06 for more information.”). Otherwise, the citation should be in parentheses; use \citep{} (as in “Deep learning shows promise to make progress towards AI (Bengio+chapter2007).”).
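The two citation forms can be sketched as follows. The citation keys are the ones used in the sample sentences above and are assumed to exist in your BibTeX database; natbib is assumed to be available (load it explicitly if the style file does not already do so).

```latex
% Textual citation: authors are part of the sentence, no parentheses.
See \citet{Hinton06} for more information.

% Parenthetical citation: the whole reference is wrapped in parentheses.
Deep learning shows promise to make progress towards AI \citep{Bengio+chapter2007}.
```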
The corresponding references are to be listed in alphabetical order of authors, in the References section. As to the format of the references themselves, any style is acceptable as long as it is used consistently.
5.2 Footnotes
Indicate footnotes with a number[1] in the text. Place the footnotes at the bottom of the page on which they appear. Precede the footnote with a horizontal rule of 2 inches (12 picas).[2]
[1] Sample of the first footnote
[2] Sample of the second footnote
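In LaTeX source, numbered footnotes of this kind are normally produced with the standard \footnote command, as in the sketch below. The horizontal rule and placement are handled by LaTeX and, presumably, the style file, rather than set manually.

```latex
% \footnote inserts the number in the text and places the note
% at the bottom of the page.
Indicate footnotes with a number\footnote{Sample of the first footnote} in the text.
Precede the footnote with a horizontal rule of 2 inches
(12 picas).\footnote{Sample of the second footnote}
```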
5.3 Figures
All artwork must be neat, clean, and legible. Lines should be dark enough for purposes of reproduction; art work should not be hand-drawn. The figure number and caption always appear after the figure. Place one line space before the figure caption, and one line space after the figure. The figure caption is lower case (except for first word and proper nouns); figures are numbered consecutively.
Make sure the figure caption does not get separated from the figure. Leave sufficient space to avoid splitting the figure and figure caption.
You may use color figures. However, it is best for the figure captions and the paper body to make sense if the paper is printed either in black/white or in color.
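A minimal sketch of a figure that follows these rules: the caption comes after the artwork, and the width is given relative to the line width. It assumes the graphicx package is loaded; the file name myfile.pdf, the caption text, and the label are placeholders.

```latex
% Requires \usepackage{graphicx}.
\begin{figure}[h]
  \begin{center}
    % Placeholder file name; width is a multiple of the line width.
    \includegraphics[width=0.8\linewidth]{myfile.pdf}
  \end{center}
  % Caption after the figure; lower case except first word and proper nouns.
  \caption{Sample figure caption.}
  \label{fig:sample} % hypothetical label
\end{figure}
```

Keeping \caption inside the same figure environment as the graphic is what stops the caption from being separated from the figure across a page break.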
5.4 Tables
All tables must be centered, neat, clean and legible. Do not use hand-drawn tables. The table number and title always appear before the table. See Table 1.
Place one line space before the table title, one line space after the table title, and one line space after the table. The table title must be lower case (except for first word and proper nouns); tables are numbered consecutively.
PART | DESCRIPTION |
---|---|
Dendrite | Input terminal |
Axon | Output terminal |
Soma | Cell body (contains cell nucleus) |
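The table above could be typeset roughly as follows. This is a sketch rather than the exact source of the template; the caption text and label are placeholders. Note that, unlike figures, the caption (table title) is placed before the tabular material.

```latex
\begin{table}[t]
  % Title before the table; lower case except first word and proper nouns.
  \caption{Sample table title}
  \label{sample-table} % hypothetical label
  \begin{center}
    \begin{tabular}{ll}
      \multicolumn{1}{c}{\bf PART} & \multicolumn{1}{c}{\bf DESCRIPTION} \\
      \hline
      Dendrite & Input terminal \\
      Axon     & Output terminal \\
      Soma     & Cell body (contains cell nucleus) \\
    \end{tabular}
  \end{center}
\end{table}
```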
6 Default Notation
In an attempt to encourage standardized notation, we have included the notation file from the textbook Deep Learning (goodfellow2016deep), available at https://github.com/goodfeli/dlbook_notation/. Use of this style is not required and can be disabled by commenting out math_commands.tex.
Numbers and Arrays
$a$ | A scalar (integer or real) |
$\bm{a}$ | A vector |
$\bm{A}$ | A matrix |
$\bm{\mathsf{A}}$ | A tensor |
$\bm{I}_n$ | Identity matrix with $n$ rows and $n$ columns |
$\bm{I}$ | Identity matrix with dimensionality implied by context |
$\bm{e}^{(i)}$ | Standard basis vector $[0,\dots,0,1,0,\dots,0]$ with a 1 at position $i$ |
$\text{diag}(\bm{a})$ | A square, diagonal matrix with diagonal entries given by $\bm{a}$ |
$\mathrm{a}$ | A scalar random variable |
$\mathbf{a}$ | A vector-valued random variable |
$\mathbf{A}$ | A matrix-valued random variable |
Sets and Graphs
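$\mathbb{A}$ | A set |
$\mathbb{R}$ | The set of real numbers |
$\{0, 1\}$ | The set containing 0 and 1 |
$\{0, 1, \dots, n\}$ | The set of all integers between $0$ and $n$ |
$[a, b]$ | The real interval including $a$ and $b$ |
$(a, b]$ | The real interval excluding $a$ but including $b$ |
$\mathbb{A} \backslash \mathbb{B}$ | Set subtraction, i.e., the set containing the elements of $\mathbb{A}$ that are not in $\mathbb{B}$ |
$\mathcal{G}$ | A graph |
$Pa_{\mathcal{G}}(\mathrm{x}_i)$ | The parents of $\mathrm{x}_i$ in $\mathcal{G}$ |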
Indexing
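$a_i$ | Element $i$ of vector $\bm{a}$, with indexing starting at 1 |
$a_{-i}$ | All elements of vector $\bm{a}$ except for element $i$ |
$A_{i,j}$ | Element $i, j$ of matrix $\bm{A}$ |
$\bm{A}_{i,:}$ | Row $i$ of matrix $\bm{A}$ |
$\bm{A}_{:,i}$ | Column $i$ of matrix $\bm{A}$ |
$\mathsf{A}_{i,j,k}$ | Element $(i, j, k)$ of a 3-D tensor $\bm{\mathsf{A}}$ |
$\bm{\mathsf{A}}_{:,:,i}$ | 2-D slice of a 3-D tensor |
$\mathrm{a}_i$ | Element $i$ of the random vector $\mathbf{a}$ |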
Calculus
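$\frac{dy}{dx}$ | Derivative of $y$ with respect to $x$ |
$\frac{\partial y}{\partial x}$ | Partial derivative of $y$ with respect to $x$ |
$\nabla_{\bm{x}} y$ | Gradient of $y$ with respect to $\bm{x}$ |
$\nabla_{\bm{X}} y$ | Matrix derivatives of $y$ with respect to $\bm{X}$ |
$\nabla_{\bm{\mathsf{X}}} y$ | Tensor containing derivatives of $y$ with respect to $\bm{\mathsf{X}}$ |
$\frac{\partial f}{\partial \bm{x}}$ | Jacobian matrix $\bm{J} \in \mathbb{R}^{m \times n}$ of $f: \mathbb{R}^n \rightarrow \mathbb{R}^m$ |
$\nabla_{\bm{x}}^2 f(\bm{x})$ or $\bm{H}(f)(\bm{x})$ | The Hessian matrix of $f$ at input point $\bm{x}$ |
$\int f(\bm{x}) d\bm{x}$ | Definite integral over the entire domain of $\bm{x}$ |
$\int_{\mathbb{S}} f(\bm{x}) d\bm{x}$ | Definite integral with respect to $\bm{x}$ over the set $\mathbb{S}$ |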
Probability and Information Theory
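$P(\mathrm{a})$ | A probability distribution over a discrete variable |
$p(\mathrm{a})$ | A probability distribution over a continuous variable, or over a variable whose type has not been specified |
$\mathrm{a} \sim P$ | Random variable $\mathrm{a}$ has distribution $P$ |
$\mathbb{E}_{\mathrm{x} \sim P}[f(x)]$ or $\mathbb{E} f(x)$ | Expectation of $f(x)$ with respect to $P(\mathrm{x})$ |
$\mathrm{Var}(f(x))$ | Variance of $f(x)$ under $P(\mathrm{x})$ |
$\mathrm{Cov}(f(x), g(x))$ | Covariance of $f(x)$ and $g(x)$ under $P(\mathrm{x})$ |
$H(\mathrm{x})$ | Shannon entropy of the random variable $\mathrm{x}$ |
$D_{\mathrm{KL}}(P \Vert Q)$ | Kullback-Leibler divergence of P and Q |
$\mathcal{N}(\bm{x}; \bm{\mu}, \bm{\Sigma})$ | Gaussian distribution over $\bm{x}$ with mean $\bm{\mu}$ and covariance $\bm{\Sigma}$ |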
Functions
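$f: \mathbb{A} \rightarrow \mathbb{B}$ | The function $f$ with domain $\mathbb{A}$ and range $\mathbb{B}$ |
$f \circ g$ | Composition of the functions $f$ and $g$ |
$f(\bm{x}; \bm{\theta})$ | A function of $\bm{x}$ parametrized by $\bm{\theta}$. (Sometimes we write $f(\bm{x})$ and omit the argument $\bm{\theta}$ to lighten notation) |
$\log x$ | Natural logarithm of $x$ |
$\sigma(x)$ | Logistic sigmoid, $\frac{1}{1 + \exp(-x)}$ |
$\zeta(x)$ | Softplus, $\log(1 + \exp(x))$ |
$||\bm{x}||_p$ | $L^p$ norm of $\bm{x}$ |
$||\bm{x}||$ | $L^2$ norm of $\bm{x}$ |
$x^+$ | Positive part of $x$, i.e., $\max(0, x)$ |
$\mathbf{1}_{\mathrm{condition}}$ | is 1 if the condition is true, 0 otherwise |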
7 Final instructions
Do not change any aspects of the formatting parameters in the style files. In particular, do not modify the width or length of the rectangle the text should fit into, and do not change font sizes (except perhaps in the References section; see below). Please note that pages should be numbered.
8 Preparing PostScript or PDF files
Please prepare PostScript or PDF files with paper size “US Letter”, and not, for example, “A4”. The -t letter option on dvips will produce US Letter files.
Consider directly generating PDF files using pdflatex (especially if you are a MiKTeX user). PDF figures must be substituted for EPS figures, however.
Otherwise, please generate your PostScript and PDF files with the following commands:
dvips mypaper.dvi -t letter -Ppdf -G0 -o mypaper.ps
ps2pdf mypaper.ps mypaper.pdf
8.1 Margins in LaTeX
Most of the margin problems come from figures positioned by hand using \special or other commands. We suggest using the command \includegraphics from the graphicx package. Always specify the figure width as a multiple of the line width as in the example below using .eps graphics
\usepackage[dvips]{graphicx} ...
\includegraphics[width=0.8\linewidth]{myfile.eps}
or
\usepackage[pdftex]{graphicx} ...
\includegraphics[width=0.8\linewidth]{myfile.pdf}
for .pdf graphics. See section 4.4 in the graphics bundle documentation (http://www.ctan.org/tex-archive/macros/latex/required/graphics/grfguide.ps).
A number of width problems arise when LaTeX cannot properly hyphenate a line. Please give LaTeX hyphenation hints using the \- command.
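As a small sketch of such a hint, the discretionary hyphen \- marks the points at which LaTeX is allowed to break a word. The word and the break points chosen below are arbitrary examples, not a recommendation for any particular term.

```latex
% Each \- marks a permitted break point; no hyphen is printed
% unless LaTeX actually breaks the line there.
The model is sensitive to its hyper\-pa\-ram\-e\-ter settings.
```

Once a word contains \-, LaTeX breaks it only at the marked positions, so list every acceptable break point rather than just one.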
Author Contributions
If you’d like to, you may include a section for author contributions as is done in many journals. This is optional and at the discretion of the authors.
Acknowledgments
Use unnumbered third level headings for the acknowledgments. All acknowledgments, including those to funding agencies, go at the end of the paper.
Appendix A Appendix
You may include other additional sections here.