-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce chance of string buffer overflow exploits #17
base: master
Are you sure you want to change the base?
Conversation
Gcc-10.2.0 gives many compile warnings of the risk of string buffer overflows. These risk segmentation violations with long path lengths and security problems more generally, which is why the warnings exist. This commit aims to reduce the chance of buffer overflows by using `snprintf` instead of `sprintf` and extending from just the C preprocesor macro MAX_STRING_LEN to MAX_BUF_SIZE defined as (3*MAX_STRING_LEN+40). A few unused variables are also commented out in response to the recommendations from gcc. This commit compiles cleanly for me with gcc-10.2.0, with `make OPTIONS=-fcommon`.
Thanks broukema. Let me have a look. |
Darren, my name is actually "Boud" (lower case is fine in this context, of course - boud) :), not broukema. The broukema alias is only because someone else than me grabbed 'boud' on github before I did... :P |
code/core_save.c
Outdated
@@ -57,7 +61,8 @@ void save_galaxies(int filenr, int tree) | |||
// only open the file if it is not already open. | |||
if( !save_fd[n] ) | |||
{ | |||
snprintf(buf, MAX_STRING_LEN - 1, "%s/%s_z%1.3f_%d", OutputDir, FileNameGalaxies, ZZ[ListOutputSnaps[n]], filenr); | |||
snprintf(buf, MAX_BUF_SIZE - 1, "%s/%s_z%1.3f_%d", OutputDir, FileNameGalaxies, ZZ[ListOutputSnaps[n]], filenr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since buf
is allocated as MAX_STRING_LEN
, the second parameter should be MAX_STRING_LEN - 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi Manodeep. You're right that buf
is declared to be smaller. But I would rather fix that by changing line 22
char buf[MAX_STRING_LEN];
to
char buf[MAX_BUF_SIZE];
because there are two potentially long strings - OutputDir
and FileNameGalaxies
- plus a bit of extra stuff that are written to buf
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup - good call
static const struct GALAXY_OUTPUT galaxy_output_null = {0}; | ||
struct GALAXY_OUTPUT galaxy_output = galaxy_output_null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious - what's the advantage of splitting up this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking, you're welcome to drop that, depending on how interested you are in future integration with our elaphrocentre galaxy formation pipeline, which we just posted to ArXiv:2010.03742 a few days ago. If you look at line 1042 of Marius' @mpeper patch
which is applied before the other two patches at lines 1610--1612 of the current commit of
then you'll see why we did that. For a given "final" halo at the current epoch, we add up the mass infall histories of all the predecessor haloes. Since this is a loop through "final" haloes, it's safer to zero the galaxy_output structure between iterations.
You're welcome to come over to codeberg and propose improvements to our sage patches, especially for more modularity. Or you can try to convince @mpeper to modularise and tidy his patch enough to propose a pull request here ;).
@@ -20,13 +20,15 @@ FILE *load_fd; | |||
|
|||
// External Functions // | |||
|
|||
#define MAX_BUF_SIZE (3*MAX_STRING_LEN+40) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Should we check whether MAX_BUF_SIZE
is defined? Something along the lines of
#ifndef MAX_BUF_SIZE
#define MAX_BUF_SIZE ...
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but in that case there should be a consistent "universal" value is - the safe value would obviously be the biggest. My version of the pull request currently gives
$ grep --color -rn "define.*MAX_BUF_SIZE " code/*
code/core_io_tree.c:19:#define MAX_BUF_SIZE (3*MAX_STRING_LEN+20)
code/core_save.c:17:#define MAX_BUF_SIZE (2*MAX_STRING_LEN+40)
code/io/tree_binary.c:23:#define MAX_BUF_SIZE (3*MAX_STRING_LEN+40)
which is not globally consistent.
So
#ifndef MAX_BUF_SIZE
#define MAX_BUF_SIZE (3*MAX_STRING_LEN+40)
#endif
would seem the safest in all three places.
code/io/tree_binary.c
Outdated
void load_tree_table_binary(int32_t filenr) | ||
{ | ||
int i, totNHalos; | ||
char buf[MAX_STRING_LEN]; | ||
char buf[MAX_BUF_SIZE]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps allocate to MAX_BUF_SIZE + 1
(to avoid passing MAX_BUF_SIZE - 1
to snprintf
). This would be true here and elsewhere in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no particular preference either way, though I would tend to agree. I guess we could ask "which convention is a typical cosmologist most likely to expect?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see that the simplest option is for me to propose an updated pull request. And that the - 1
only occurs in one place. I'm preparing an update...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the best way is to pass MAX_STRING_LEN - 1
as the second parameter to snprintf
. That way calculations for MAX_BUF_SIZE
will still be correct.
code/main.c
Outdated
@@ -16,7 +16,8 @@ | |||
|
|||
#include "io/io_save_hdf5.h" | |||
|
|||
char bufz0[1000]; | |||
#define MAX_BUFZ0_SIZE (3*MAX_STRING_LEN+20) | |||
char bufz0[MAX_BUFZ0_SIZE]; /* 3 strings + max 19 bytes for a number */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think you have the extra byte already accounted for - but might be good to change MAX_BUFZ0_SIZE to the 3*MAX_STRING_LEN + 19
, and then add the + 1
in the allocation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a few (2 or 3) unused bytes is low risk. Nobody's going to hide exploitable code there. The risk is people overwriting the buffer and jumping into RAM where they're not supposed to go. Caveat: this assumes that I understand the security issue here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not worried about the wasted bytes - simply thinking about being explicit about the calculation. The extra + 1
would show that that was the space for terminating NULL byte.
@@ -58,6 +58,7 @@ double infall_recipe(int centralgal, int ngal, double Zcurr) | |||
infallingMass = | |||
reionization_modifier * BaryonFrac * Gal[centralgal].Mvir - (tot_stellarMass + tot_coldMass + tot_hotMass + tot_ejected + tot_BHMass + tot_ICS); | |||
// reionization_modifier * BaryonFrac * Gal[centralgal].deltaMvir - newSatBaryons; | |||
UNUSED(newSatBaryons); /* for the moment this is not used */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea
@@ -7,7 +7,7 @@ | |||
#include "core_allvars.h" | |||
#include "core_proto.h" | |||
|
|||
|
|||
#define UNUSED(foo) (void)(foo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should move this to core_allvars.h
. Or at least undefine UNUSED
at the bottom of the file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it were shifted to core_allvars.h
, then yes, I agree that undefining UNUSED
at the end would make sense. On the other hand, core/model_infall.c
is not going to be included in any other file, so I'm not convinced of the point of undefining it.
|
||
sprintf(fname, "%s", FileWithSnapList); | ||
snprintf(fname, MAX_STRING_LEN, "%s", FileWithSnapList); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this comment further down - but would be good to either allocate with a MAX_STRING_LEN + 1
or pass MAX_STRING_LEN - 1
to snprintf
code/core_read_parameter_file.c
Outdated
@@ -10,7 +10,8 @@ | |||
void read_parameter_file(char *fname) | |||
{ | |||
FILE *fd; | |||
char buf[MAX_STRING_LEN], buf1[MAX_STRING_LEN]; | |||
#define MAX_BUF_SIZE_FILE_LIST (3*MAX_STRING_LEN) | |||
char buf[MAX_BUF_SIZE_FILE_LIST], buf1[MAX_STRING_LEN]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are fixing this, may be we should put buf1
in the next line
While I said "request changes", really the only necessary change is the first one |
And many thanks for this PR and improving the code quality! |
This commit implements several small fixes more or less as proposed by Manodeep Singh at darrencroton#17 . These compile for me without error either with gcc-10.2.0 using `make OPTIONS=-fcommon` or with gcc-6.3.0-18+deb9u1. I don't have a convenient regression test ready to test it, so feel free to do a regression test before merging. The previous version worked as implemented in the full `elaphrocentre` galaxy formation pipeline - https://codeberg.org/boud/elaphrocentre . Possible TODO: There's a section of `code/io/tree_hdf5.c` with `char fname[1000]` and several `sprintf` statements. These should are not warned about by gcc-10.2.0 and there's probably no point being pedantic and switching them to `snprintf` statements.
This commit implements several small fixes more or less as proposed by Manodeep Sinha at darrencroton#17 . These compile for me without error either with gcc-10.2.0 using `make OPTIONS=-fcommon` or with gcc-6.3.0-18+deb9u1. I don't have a convenient regression test ready to test it, so feel free to do a regression test before merging. The previous version worked as implemented in the full `elaphrocentre` galaxy formation pipeline - https://codeberg.org/boud/elaphrocentre . Possible TODO: There's a section of `code/io/tree_hdf5.c` with `char fname[1000]` and several `sprintf` statements. These should are not warned about by gcc-10.2.0 and there's probably no point being pedantic and switching them to `snprintf` statements.
@manodeep So I see that "conversation" is another non-git service that github provides on top of git and risks creating vendor lock-in, because it's not easy to shift to other git repository hosters, and is nevertheless useful for people who wish to trace what happened, when and why in order to debug or tidy up code. Anyway, I've updated this pull request. Thanks for the review! :) |
@@ -120,7 +121,7 @@ int main(int argc, char **argv) | |||
else | |||
fclose(fd); | |||
|
|||
sprintf(bufz0, "%s/%s_z%1.3f_%d", OutputDir, FileNameGalaxies, ZZ[ListOutputSnaps[0]], filenr); | |||
snprintf(bufz0, MAX_BUFZ0_SIZE, "%s/%s_z%1.3f_%d", OutputDir, FileNameGalaxies, ZZ[ListOutputSnaps[0]], filenr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second parameter should be MAX_BUFZ0_SIZE - 1
@@ -111,7 +112,7 @@ int main(int argc, char **argv) | |||
for(filenr = FirstFile; filenr <= LastFile; filenr++) | |||
#endif | |||
{ | |||
sprintf(bufz0, "%s/%s.%d%s", SimulationDir, TreeName, filenr, TreeExtension); | |||
snprintf(bufz0, MAX_BUFZ0_SIZE, "%s/%s.%d%s", SimulationDir, TreeName, filenr, TreeExtension); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second parameter should be MAX_BUFZ0_SIZE - 1
Gcc-10.2.0 gives many compile warnings of the risk of string
buffer overflows. The uncontrolled
sprintf
commands risk segmentation violations if longpath lengths are used, and risk security problems more generally, which is why
the warnings exist.
This commit aims to reduce the chance of buffer overflows by
using
snprintf
instead ofsprintf
and extending from justthe C preprocesor macro MAX_STRING_LEN to extra macros MAX_BUF_SIZE
and MAX_OUTFILE_SIZE, defined e.g.
as (3*MAX_STRING_LEN+40). A few unused variables are also
commented out in response to the recommendations from gcc.
This commit compiles cleanly for me with gcc-10.2.0, with
make OPTIONS=-fcommon
to work around Issue #16 .