-
Notifications
You must be signed in to change notification settings - Fork 0
/
WasserStein_main.py
138 lines (99 loc) · 5.65 KB
/
WasserStein_main.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
'''
Main file to run the Wasserstein metric evaluation
author @mhansinger
A minimal example to demonstrate the Wasserstein metric calculation procedure for quantitative evaluation and comparison
of reactive flow simulations. The Wasserstein metric can then be computed either for one quantity in the
thermo-chemical space, e.g., only T, or in the form of a stacked metric, e.g., [T,CO,CO2, H2 ].
The latter one allows for a more holistic error analysis, as the contribution from each variable to the overall
dissimilarity (between the data sets) becomes visible. In the case of one-dimensional distributions,
the obtained value of the metric shares the same unit as the sample data. For instance, if two distributions
of temperature are considered, the corresponding Wasserstein metric in units of Kelvin can be interpreted as the
average difference between the values of temperature from the two distributions. In the case of multidimensional
distributions, each dimension is normalized before pair-wise distances are calculated. The choice of the normalization
method is application-specic. In the present study, the marginal standard deviation is chosen. The so obtained W2 represents
the averaged difference, which is proportional to the marginal standard deviations, between samples from the two distributions.
The evaluation procedure is based on th work of:
- Johnson, R., Wu, H., and Ihme, M. A general probabilistic approach for the quantitative assessment of LES combustion models.
Combust. Flame 183, pp. 88-101, 2017.
The sample data sets are from experiments and LES of the Sydney/Sandia piloted methane/air flame with inhomogeneous inlet.
Configuration: FJ-5GP-Lr75-57
Experimental data:
- Meares and Masri, Combust. Flame, 161(2), pp. 484-495, 2014.
- Barlow, Meares, Magnotti, Cutcher, and Masri, Combust. Flame, 162(10), pp. 3516-3540, 2015.
Simulation data:
- Hansinger, M., Zirwes, T., Zips, J., Pfitzner, M., Zhang, F., Habisreuther, P. and Bockhorn, H.,
The Eulerian Stochastic Fields method applied to Large Eddy Simulations of a piloted flame with inhomogeneous inlet,
Flow, Turbulence and Combustion, 2020.
The main file computes the Wasserstein metric calculation (W2), Transport matrix (Transport), and stacked Wasserstein
metric contributions (Stack) for more than one dimension (i.e. [T,CO,CO2, H2 ]).
The figures show the dissimilarity in terms of histograms and a bar plots with the stacked contribution of each dimension (species).
The plots are saved as .pdf and .tex for further LaTex production.
The stacked data is also dumped to a pickle file, for further evaluation.
See the 'requirements.txt' for additional python packages.
'''
import pandas as pd
#import dask.dataframe as dd
import numpy as np
from calcW2 import calcW2
from plotW2stack import plotW2stack
from os.path import join
import pickle
###################################
# your input
expPath = 'Exp'
simPaths = ['Sim']
# planes where the scatter data was sampled
planes = ['050','100','150','200']
# sample size, set down for faster computation
N = 100 #1000
###################################
Stacked_dict = {}
stacked_labels = ['T', 'CO2', 'CO','H2']
for simPath in simPaths:
print(simPath)
Stack_W4_planes=pd.DataFrame(columns=planes)
for plane in planes:
print(plane)
# Read in Data
print('... reading experimental data')
# Read in experimental data
headerlines = 0
expPath_ = join(expPath,'sample_scatter_xD%s.txt' % (str(plane)))
expData = pd.read_csv(expPath_, delimiter='\t', header=headerlines)
# Read in simulated data
print('... reading simulated data')
this_simPath = join(simPath ,'sample_scatter_xD%s.txt' % (str(plane)))
simData = pd.read_csv(this_simPath, delimiter='\t', header=headerlines)
print(' ')
print('... perform calculations of Wasserstein metric (each calculations takes few seconds)')
## One-Dimensional Z Comparsion
print('... cacluates Wasserstein metric')
W2_Z, Transport_Z, Stack_Z = calcW2(expData, simData, ['f_Bilger'], N, simPath, plane)
W2_CO, Transport_CO, Stack_CO = calcW2(expData, simData, ['CO'], N, simPath, plane)
W2_CO2, Transport_CO2, Stack_CO2 = calcW2(expData, simData, ['CO2'], N, simPath, plane)
W2_T, Transport_T, Stack_T = calcW2(expData, simData, ['T'], N, simPath, plane)
W2_H2, Transport_H2, Stack_H2 = calcW2(expData, simData, ['H2'], N, simPath, plane)
## Two-Dimensional Z-T Comparison
print('... cacluates Wasserstein metric for scalar(s): Z, T')
W2_ZT, Transport_ZT, Stack_ZT = calcW2(expData, simData, ['f_Bilger', 'T'], N, simPath, plane)
## Four-Dimensional Z-T-CO2-CO Comparison
print('... cacluates stacked Wasserstein metric for scalar(s): ',stacked_labels)
W2_4dim, Transport_4dim, Stack_4dim = calcW2(expData, simData, stacked_labels, N, simPath,
plane)
## Plot Stacked Contributions
print('... plot results')
Stacked_data = np.zeros((3, 4))
Stacked_data[0, 0] = Stack_Z
Stacked_data[1, 0:2] = Stack_ZT
Stacked_data[2, :] = Stack_4dim
print(Stack_4dim)
Stacked_df = pd.DataFrame(Stacked_data)
Stacked_df.columns = stacked_labels
plotW2stack(Stacked_df, simPath, plane)
Stack_W4_planes[plane] = Stack_4dim
# create new index according to labels
Stack_W4_planes.index=stacked_labels
Stacked_dict[simPath] = Stacked_df
# save stacked dict
with open('Stacked_dict.pkl', 'wb') as handle:
pickle.dump(Stacked_dict, handle, protocol=pickle.HIGHEST_PROTOCOL)