Statistical differences noticed moving from ZMQ to ADIOS2

The ADIOS2 repo has been updated to support Sobol' indices computation. For the time being, we cache all the NumPy arrays received from independent writers and then stack them together before computing the Sobol' statistics.
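As a rough sketch of this caching strategy (illustrative names only, not the actual Melissa server API): arrays are cached per writer as they arrive, then stacked into one matrix once everything is in.

```python
import numpy as np

class SobolCache:
    """Hypothetical sketch: cache arrays per writer, stack before Sobol' stats."""

    def __init__(self):
        self._per_writer = {}  # writer id -> list of 1-D numpy arrays

    def add(self, writer_id, array):
        # Cache the array received from one independent writer.
        self._per_writer.setdefault(writer_id, []).append(
            np.asarray(array, dtype=np.float64)
        )

    def stacked(self):
        # One row per received array, grouped by writer id.
        rows = [a for wid in sorted(self._per_writer) for a in self._per_writer[wid]]
        return np.vstack(rows)

cache = SobolCache()
cache.add(0, [1.0, 2.0, 3.0])
cache.add(1, [4.0, 5.0, 6.0])
print(cache.stacked().shape)  # (2, 3)
```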

I have noticed a few changes in the statistics.

Result comparisons

First, I compared runs from both the ZMQ implementation (develop branch) and the current adios2-newapi branch.

I compute the L2 norm of the difference for every result file generated at timestep 0, i.e. STUDY_OUT/results/*.001.

(melissa) apurandare@gros-121:~/MELISSA/tests/data$ python3 l2norm.py $RESULTS_DIR ~/.cache/ZMQ_RESULTS/sobol


File1 directory: /home/apurandare/MELISSA/examples/heat-pde/heat-pde-sa/STUDY_OUT/results
File2 directory: /home/apurandare/.cache/ZMQ_RESULTS/sobol

File                                    L2 norm
results.temperature_variance.001        0.000e+00
results.temperature_sobol_tot0.001      0.000e+00
results.temperature_mean.001            0.000e+00
results.temperature_sobol0.001          0.000e+00
results.temperature_kurtosis.001        0.000e+00
results.temperature_sobol1.001          0.000e+00
results.temperature_sobol_tot1.001      0.000e+00
results.temperature_skewness.001        1.234e-15
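For reference, the per-file comparison above could be sketched as follows, assuming the result files are plain-text numeric arrays readable by numpy.loadtxt (the actual l2norm.py may differ):

```python
import glob
import os
import sys

import numpy as np

def l2_norm_diff(file1: str, file2: str) -> float:
    """L2 norm of the element-wise difference between two result files."""
    a = np.loadtxt(file1)
    b = np.loadtxt(file2)
    return float(np.linalg.norm(a - b))

if __name__ == "__main__" and len(sys.argv) == 3:
    # Compare same-named timestep-0 result files from two directories.
    dir1, dir2 = sys.argv[1], sys.argv[2]
    for f1 in sorted(glob.glob(os.path.join(dir1, "*.001"))):
        f2 = os.path.join(dir2, os.path.basename(f1))
        print(f"{os.path.basename(f1)}  L2 norm: {l2_norm_diff(f1, f2):.3e}")
```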

Pytest

The pytest in tests/server/test_sensitivity_analysis_server.py has the following assertion after computing the Sobol' statistics:

assert (
    abs(server.melissa_moments["field"][0][0].get_variance() - variance) / variance
    < 1e-3
)

which now fails because the relative error exceeds the tolerance:

>>> np.array([8.57702183e+18]) / 4.474760921668681e+21
array([0.00191676])
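Plugging in the observed numbers confirms that the relative error fails the current tolerance but would pass the proposed one:

```python
# Relative variance error from the failing assertion above.
abs_diff = 8.57702183e18        # |computed variance - reference variance|
variance = 4.474760921668681e21
rel_err = abs_diff / variance

print(rel_err)                  # ~0.00191676
assert not rel_err < 1e-3       # current tolerance: fails
assert rel_err < 2e-3           # proposed tolerance: passes
```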

Is the difference in skewness acceptable, and can we increase the assertion tolerance in the pytest from 1e-3 to 2e-3?

Let me know your thoughts.

P.S.: Non-Sobol' results show no floating-point differences.

Hey @AbhishekP ,

Did something change in the code relative to last fall? When I left, I ensured a green CI was passing all steps, including the SA integration test, which cannot pass unless the output is correct. In essence, the SA step enforced perfect continuity between the non-ADIOS and ADIOS branches, since the compared output had to be identical for a passing CI.

Therefore, I'm a bit confused as to where this would come from. Were there any changes made to the Sobol' code or to other parts of the branch that would cause a failing CI?

I’ll try to look at the git history and CI history to find out.

Cheers,

Robert

Here is the CI stage that tests the ADIOS SA run and compares it to the same results as the ZMQ version:

Hey @rob,

It's nothing major. I did the ZMQ run on my local machine and compared it with the ADIOS2 run on Grid'5000, so it is just a minor precision difference, which is expected. There is no precision error during the CI runs, which are on my local machine.
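The small skewness difference is consistent with floating-point addition not being associative: higher moments such as skewness accumulate many sums, and if the ZMQ and ADIOS2 paths accumulate contributions in a different order, the last bits of the result change without anything being wrong. A minimal illustration (not Melissa code):

```python
import numpy as np

# Summing the same values in two different orders: the results agree to
# many significant digits but are typically not bit-identical, which is
# the scale of the skewness difference reported above.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)

naive = sum(x.tolist())      # sequential left-to-right summation
pairwise = float(np.sum(x))  # NumPy's pairwise summation
print(abs(naive - pairwise)) # tiny (often nonzero) difference
```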

On CI side, the docker (t1700) machine was not available. So, I had to modify some CI stages to be run on my local VC (lxd-runner). I am just expanding on your CI stages in adios2-newapi branch. But, there were some stages like slurm semiglobal for which I was confused a bit, cause there was no check being made for the result produced. But, if there are any errors then it’s probably on me cause we have changed the server side after the new adios2 api came out. Thanks for responding though.