PIO: FATAL ERROR: The user buffer passed to PIO_get_var is smaller ... than the size of the variable -- possibly when trying to read restart written with different compiler #6844

Open
ndkeen opened this issue Dec 10, 2024 · 0 comments
Labels
SCORPIO The E3SM I/O library (derived from PIO)

ndkeen commented Dec 10, 2024

For now I am just recording this error, as it may help organize related reports.

I have encountered this error twice now, and @brhillman has also seen it (or something very similar).

On Frontier, I saw this when I tried to restart from an older set of restart files that had been written with a different compiler than the one used to build the executable (AMD vs Cray).

end of cpl log:
(seq_infodata_Init)  read rpointer file rpointer.drv
(seq_infodata_Init)  restart file from rpointer= SCREAM.2024-autocal-00.ne1024pg2.cpl.r.2020-01-25-00000.nc

error in e3sm log:
 1025: PIO: FATAL ERROR: Aborting... An error occured,  The user buffer passed to PIO_get_var is smaller (total size of user buffer =            1  elements) than the size of the variable (total size of the variable =                   256  elements). varid =            3 , file id = \
          16. Unknown Error: Unrecognized error code (err = -501) (err=-501). Aborting since the error handler was set to PIO_INTERNAL_ERROR... (spio_get_var.F90: 739)

On Frontier, once I built with the Cray compiler, the restart worked.

Separately, an ne120 case I ran on muller-cpu (with the Intel compiler) was copied to pm-cpu, and Xue attempted to continue the run there (also with the Intel compiler), but hit:

630: PIO: FATAL ERROR: Aborting... An error occured, The user buffer passed to PIO_get_var is smaller (total size of user buffer =      1 elements) than the size of the variable (total size of the variable =          256 elements). varid =      3 , file id =      16. Unknown Error: Unrecognized error code (err = -501) (err=-501). Aborting since the error handler was set to PIO_INTERNAL_ERROR... (spio_get_var.F90: 739)

/pscratch/sd/x/xzheng/E3SM_simulations/more20241121.ne120pg2_r025_RRSwISC6to18E3r5.F2010-ORO-GWD_plus4K.i2.muller-cpu/tests/S_1x10_ndays

The original run (copied from muller-cpu) is
/global/cfs/cdirs/e3sm/ndk/tmpdata/20241130.ne120pg2_r025_RRSwISC6to18E3r5.F20TR.test3.muller-cpu

I have not investigated this much yet, but I verified that the same version of the Intel compiler was used in both cases.
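
To help collect occurrences of this message from e3sm logs, here is a minimal sketch of a parser for it. The pattern is inferred only from the two messages quoted in this issue, not from the SCORPIO source, so it is an assumption and may need adjusting for other variants. The function name and the field names (`rank`, `buf`, `var`, `varid`, `fileid`) are hypothetical, chosen for illustration.

```python
import re

# Pattern for the "user buffer ... is smaller" message quoted above.
# \s* absorbs the variable padding; the optional backslash covers the
# wrapped form where "file id = \" continues on the next line.
PIO_BUF_ERR = re.compile(
    r"(?P<rank>\d+):\s*PIO: FATAL ERROR:.*?"
    r"passed to (?P<func>\w+) is smaller.*?"
    r"total size of user buffer =\s*(?P<buf>\d+)\s*elements.*?"
    r"total size of the variable =\s*(?P<var>\d+)\s*elements.*?"
    r"varid =\s*(?P<varid>\d+)\s*,\s*file id =\s*\\?\s*(?P<fileid>\d+)",
    re.DOTALL,
)


def parse_pio_buffer_error(text):
    """Return the fields of the first matching message, or None if absent."""
    m = PIO_BUF_ERR.search(text)
    if m is None:
        return None
    # Every captured group is numeric except the function name.
    fields = {k: int(v) for k, v in m.groupdict().items() if k != "func"}
    fields["func"] = m.group("func")
    return fields
```

Running `parse_pio_buffer_error` over the text of each e3sm log gives the buffer size, variable size, `varid`, and file id in one place, which makes it easier to check whether different failures involve the same variable.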
