Statistical differences noticed moving from ZMQ to ADIOS2

The ADIOS2 repo has been updated to support Sobol' indices computation. For the time being, we cache all the NumPy arrays received from independent writers and then stack them together before computing the Sobol' statistics.
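As a rough sketch of this caching strategy (illustrative names only, not the actual Melissa server API): arrays are cached per writer as they arrive, then stacked into one matrix once everything is in.

```python
import numpy as np

class SobolCache:
    """Hypothetical sketch: cache arrays per writer, stack before Sobol' stats."""

    def __init__(self):
        self._per_writer = {}  # writer id -> list of 1-D numpy arrays

    def add(self, writer_id, array):
        # Cache the array received from one independent writer.
        self._per_writer.setdefault(writer_id, []).append(
            np.asarray(array, dtype=np.float64)
        )

    def stacked(self):
        # One row per received array, grouped by writer id.
        rows = [a for wid in sorted(self._per_writer) for a in self._per_writer[wid]]
        return np.vstack(rows)

cache = SobolCache()
cache.add(0, [1.0, 2.0, 3.0])
cache.add(1, [4.0, 5.0, 6.0])
print(cache.stacked().shape)  # (2, 3)
```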

I have noticed a few changes in the statistics.

Result comparisons

First, I compared runs from both the ZMQ implementation (develop branch) and the current adios2-newapi branch.

I compute the L2 norm of the difference for every result file generated at timestep 0, i.e. STUDY_OUT/results/*.001.

(melissa) apurandare@gros-121:~/MELISSA/tests/data$ python3 l2norm.py $RESULTS_DIR ~/.cache/ZMQ_RESULTS/sobol


File1 directory: /home/apurandare/MELISSA/examples/heat-pde/heat-pde-sa/STUDY_OUT/results
File2 directory: /home/apurandare/.cache/ZMQ_RESULTS/sobol

File                                    L2 norm
results.temperature_variance.001        0.000e+00
results.temperature_sobol_tot0.001      0.000e+00
results.temperature_mean.001            0.000e+00
results.temperature_sobol0.001          0.000e+00
results.temperature_kurtosis.001        0.000e+00
results.temperature_sobol1.001          0.000e+00
results.temperature_sobol_tot1.001      0.000e+00
results.temperature_skewness.001        1.234e-15
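For reference, the per-file comparison above could be sketched as follows, assuming the result files are plain-text numeric arrays readable by numpy.loadtxt (the actual l2norm.py may differ):

```python
import glob
import os
import sys

import numpy as np

def l2_norm_diff(file1: str, file2: str) -> float:
    """L2 norm of the element-wise difference between two result files."""
    a = np.loadtxt(file1)
    b = np.loadtxt(file2)
    return float(np.linalg.norm(a - b))

if __name__ == "__main__" and len(sys.argv) == 3:
    # Compare same-named timestep-0 result files from two directories.
    dir1, dir2 = sys.argv[1], sys.argv[2]
    for f1 in sorted(glob.glob(os.path.join(dir1, "*.001"))):
        f2 = os.path.join(dir2, os.path.basename(f1))
        print(f"{os.path.basename(f1)}  L2 norm: {l2_norm_diff(f1, f2):.3e}")
```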

Pytest

The pytest in tests/server/test_sensitivity_analysis_server.py has the following assertion after computing the Sobol' statistics:

assert (
    abs(server.melissa_moments["field"][0][0].get_variance() - variance) / variance
    < 1e-3
)

which now fails because the relative error exceeds the tolerance:

>>> np.array([8.57702183e+18]) / 4.474760921668681e+21
array([0.00191676])
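Plugging in the observed numbers confirms that the relative error fails the current tolerance but would pass the proposed one:

```python
# Relative variance error from the failing assertion above.
abs_diff = 8.57702183e18        # |computed variance - reference variance|
variance = 4.474760921668681e21
rel_err = abs_diff / variance

print(rel_err)                  # ~0.00191676
assert not rel_err < 1e-3       # current tolerance: fails
assert rel_err < 2e-3           # proposed tolerance: passes
```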

Is the difference in skewness acceptable, and can we increase the assertion tolerance in the pytest from 1e-3 to 2e-3?

Let me know your thoughts.

P.S.: Non-Sobol' results show no floating-point differences.

Hey @AbhishekP ,

Did something change in the code relative to last fall? When I left, I ensured a green CI was passing all steps, including the SA integration test, which cannot pass unless the output is correct. In essence, the SA step enforced perfect continuity between the non-ADIOS and ADIOS branches, since the compared output had to be identical for a passing CI.

Therefore, I'm a bit confused as to where this would come from. Were there any changes made to the Sobol' code or to other parts of the branch that would cause a failing CI?

I’ll try to look at the git history and CI history to find out.

Cheers,

Robert

Here is the CI stage that tests the ADIOS SA run and compares it to the same results as the ZMQ version:

Hey @rob,

It's nothing major. I did the ZMQ run on my local machine and compared it with the ADIOS2 run on Grid'5000, so it is just a minor precision difference, which is expected. There is no precision error during the CI runs, which are on my local machine.
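The small skewness difference is consistent with floating-point addition not being associative: higher moments such as skewness accumulate many sums, and if the ZMQ and ADIOS2 paths accumulate contributions in a different order, the last bits of the result change without anything being wrong. A minimal illustration (not Melissa code):

```python
import numpy as np

# Summing the same values in two different orders: the results agree to
# many significant digits but are typically not bit-identical, which is
# the scale of the skewness difference reported above.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)

naive = sum(x.tolist())      # sequential left-to-right summation
pairwise = float(np.sum(x))  # NumPy's pairwise summation
print(abs(naive - pairwise)) # tiny (often nonzero) difference
```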

On CI side, the docker (t1700) machine was not available. So, I had to modify some CI stages to be run on my local VC (lxd-runner). I am just expanding on your CI stages in adios2-newapi branch. But, there were some stages like slurm semiglobal for which I was confused a bit, cause there was no check being made for the result produced. But, if there are any errors then it’s probably on me cause we have changed the server side after the new adios2 api came out. Thanks for responding though.