## Parameter Estimation for Differential Equations using Scipy Least Square

The post is about parameter estimation based on gradient descent specifically for the differential equation $\frac{dn}{dt}=G-k_1 n-k_2 n^2-k_3 n^3$, i.e. we are looking for a good $(G,k_1,k_2,k_3)$ that best fits the experimental data. We will go through the methods of choosing initial parameter guesses and running the optimization. This post is more fully explained in this note: Differential Equation Parameter Estimations. The data used here is a fraction of the sample data I was given. The codes and sample data can also be found here.

As usual, do use virtual environment for cleaner package management (you can see here). The packages needed can be installed via pip.

pip install numpy
pip install scipy
pip install matplotlib

Put the scripts scipyode_header.pycubicODE.py and cubicODE_post_processing.py into the project folder – in the virtual environment if you are using one. The code can be found at the end of the post.

1. PART 1. Loading data. In scipyode_header.py, customize load_data() as you see fit. As it is, it will load the data starting from a maximum point, and assume that the csv file contains 2 columns: column 1 is for variable t, column 2 is n. The file does not have any header.
Go into cubicODE.py, set the following toggles.

 plot_data True initial_guess_mode False optimize_mode False

Run cubicODE.py. You will see figure like figure 1. In this example, the relevant data points (blue) are the points that overlap with the orange plot.

2. PART 1. Define the ODE. Write the ODE in def f(t,n,f_args).
3. PART 2. Choosing initial guesses. Set any number of initial guesses you want to try out with by setting collection_of_initial_guesses.  Since we optimize [G,k1,k2,k3], an example guess is [0,1e-3,0.5e-3,1.2e-7]. Set the following toggles and then run cubicODE.py. Plot like figure 1 will be generated.
 plot_data False initial_guess_mode True optimize_mode False
4. PART 2. Optimization. Once the choices of initial guesses are decided upon, set the following toggles and then run cubicODE.py. The optimized parameters from each initial guess will be saved as a .par file, and the save name is set by the variable save_name. The code in our example used 3 initial guesses and save_name = “20190110_cubic”. Thus, 20190110_cubic_1.par, 20190110_cubic_2.par and 20190110_cubic_3.par will be created.
 plot_data False initial_guess_mode False optimize_mode True
5. Loading optimized parameters. Go to cubicODE_post_processing.py. If we want to load all the 3 saved .par files, then set load_name = “20190110_cubic” and serial_number = [1,2,3]. Then run cubicODE_post_processing.py. The plot results are shown in figure 2(A, B, C), showing good fit. The optimized parameters for each initial guess can be read off from the console, as in figure 2(D). The smaller cost value is, the better the fit is generally.

The optimized parameters seem to produce good fitting. We are done! Figure 1.Green plot shows the experimental data points while the rest are generated using RK45 using 3 different initial guesses, given by collection_of_initial_guesses = [  [0,1e-3,0.5e-3,1.2e-7], [0,1e-2,0,1e-7], [0,0,1.5e-3,1e-7] ]. Figure 2. (A, B, C) shows the results of parameter estimation from 3 different initial guesses. (D) shows the results in the console. For each serial number, the optimized parameters are printed as [G, k1, k2, k3].

import numpy as np
from scipy.integrate import RK45
import matplotlib.pyplot as plt
import csv

def RK45_wrapper(x0, time_checkpoints, f, f_args, stepsize=None):
if stepsize is None:
stepsize = 0.01*(time_checkpoints[-1]-time_checkpoints)
t_set = [time_checkpoints]
x_set = [x0]
t_checkpoint_new = [time_checkpoints]
x_checkpoint = [x0]

f_temp = lambda t,y : f(t,y, f_args)

for i in range(len(time_checkpoints)-1):
t1 = time_checkpoints[i+1]
if i>0:
x0 = ox.y
t0 = ox.t

else:
t0 = time_checkpoints[i]
ox = RK45(f_temp,t0,x0,t1,max_step=stepsize)

while ox.t < t1:
msg = ox.step()
t_set.append(ox.t)
x_set.append(ox.y)
x_checkpoint.append(ox.y)
t_checkpoint_new.append(ox.t)
return x_set, t_set, x_checkpoint, t_checkpoint_new

# filename : string, name of csv file without extension
# headerskip : nonnegative integer, skip the first few rows
#   if all rows are to be read, set to zero

# example
# xx = [[np.float(x) for x in y] for y in xx] # nice here
# print(xx)
# print(np.sum(xx))

out = []
with open(str(filename)+'.csv') as csv_file:
count = 0
i = 0
count = count + 1
else:
out.append(row)
if get_the_first_N_rows>0:
i = i + 1
if i == get_the_first_N_rows:
break
return out

filename = 'sample_data'
all_data = [[np.float(x) for x in y] for y in all_data]

X=[x for x in all_data]
Y=[x for x in all_data]

max_index = np.argmax(Y)
relevant_data= all_data[max_index:]
Xr = [x for x in relevant_data]
Yr = [x for x in relevant_data]

print("     data size = ", len(Xr))
print("     peak = ", X[max_index],",", Y[max_index])
if plot_data:
fig = plt.figure()
ax.scatter(X,Y,3)
ax.scatter(X[max_index],Y[max_index])
ax.plot(Xr,Yr,'r')
ax.set_xlabel("t")
ax.set_ylabel("n")
plt.show()

return Xr, Yr

cubicODE.py

import numpy as np
import matplotlib.pyplot as plt
import pickle
from scipy.optimize import least_squares

# TOGGLES:
initial_guess_mode = 0
optimize_mode = 1
# SETTINGS:
save_name = "20190110_cubic"

# --------------- PART 1 ----------------
#
# Load data, prepare the differential equation

def f(t,n, f_args): # function for pendulum motion with friction
dndt = [f_args - f_args * n - f_args * n**2 - f_args * n**3]
return dndt

x0 = np.array([Yr])
t_set_expt = [np.array([x]) for x in Xr]
n_set_expt = [np.array([x]) for x in Yr]
print("x0 = ",x0)
print("Xr[:5] = ",Xr[:5])
print("Yr[:5] = ",Yr[:5])
print("t_set_expt[:5] = ",t_set_expt[:5])
print("n_set_expt[:5] = ",n_set_expt[:5])

# --------------- PART 2 ----------------
#
# try out initial guesses

# initial guess [G, k1, k2, k3]
collection_of_initial_guesses = [
[0,1e-3,0.5e-3,1.2e-7],
[0,1e-2,0,1e-7],
[0,0,1.5e-3,1e-7]
]

if initial_guess_mode:
print("\n------------- INITIAL TESTING ----------------\n")
color_scheme = np.linspace(0,1,len(collection_of_initial_guesses))
fig = plt.figure()
ax.scatter(Xr,Yr,3,color='g',label="expt data")
for i in range(len(collection_of_initial_guesses)):
f_args0 = collection_of_initial_guesses[i]
# print(f_args0)
this_label = "initial guess" + str(i+1)
_, _, x_test, t_test = sh.RK45_wrapper(x0, t_set_expt, f, f_args0, stepsize=None)
ax.scatter(t_test,x_test,3,color=(1-color_scheme[i],0,color_scheme[i]),label=this_label)
ax.legend()
ax.set_xlabel("t")
ax.set_ylabel("n")

if optimize_mode:
def target_function(f_args):
# "checkpoint" here refers to the intended data points
#   since RK45 will generate more points in between the checkpoints.
#   What happens is just that with smaller time steps in between the checkpoints,
#   we can compute the checkpoints more precisely.
_, _, x_checkpoint, t_checkpoint_new = sh.RK45_wrapper(x0, t_set_expt, f, f_args, stepsize=None)

collection_of_f_i = []
# each i is a data point in this particular example
for x1, x2 in zip(x_checkpoint,n_set_expt):
this_norm = np.linalg.norm(np.array(x1)-np.array(x2))
collection_of_f_i.append(this_norm)
return np.array(collection_of_f_i)
print("\n------------- OPTIMIZATION STAGE ----------------\n")

for i in range(len(collection_of_initial_guesses)):
f_args0 = collection_of_initial_guesses[i]
print(" initial guess = ", f_args0)
optimized_res = least_squares(target_function,f_args0)
f_args_optimized = optimized_res.x

SAVE_DATA = {
"initial_guess": f_args0,
"optimized": f_args_optimized,
"cost":optimized_res.cost
}

print(" + optimized result = ",f_args_optimized)
print(" + cost = ",optimized_res.cost)
output = open(save_name+"_"+ str(i+1) +".par", 'wb')
pickle.dump(SAVE_DATA, output)
output.close()

plt.show()

cubicODE_post_processing.py

import numpy as np
import matplotlib.pyplot as plt
import pickle

# SETTINGS
serial_number = [1,2,3]

def f(t,n, f_args): # function for pendulum motion with friction
dndt = [f_args - f_args * n - f_args * n**2 - f_args * n**3]
return dndt
x0 = np.array([Yr])
t_set_expt = [np.array([x]) for x in Xr]
n_set_expt = [np.array([x]) for x in Yr]
print("x0 = ",x0)
print("Xr[:5] = ",Xr[:5])
print("Yr[:5] = ",Yr[:5])
print("t_set_expt[:5] = ",t_set_expt[:5])
print("n_set_expt[:5] = ",n_set_expt[:5])

print("\n------------- PLOTTING ----------------\n")
color_scheme = np.linspace(0,1,len(serial_number))

for j in range(len(serial_number)):
fig = plt.figure()
ax.scatter(Xr,Yr,3,color='g',label="expt data")
i = serial_number[j]
this_label = "initial guess" + str(i+1)
SAVE_DATA = None
pkl_file.close()
# print(SAVE_DATA)
f_args_initial = SAVE_DATA["initial_guess"]
f_args_optimized = SAVE_DATA["optimized"]
cost  = SAVE_DATA["cost"]
_, _, x_op, t_op = sh.RK45_wrapper(x0, t_set_expt, f, f_args_optimized, stepsize=None)
print("Serial number:",j+1, " - Optimized parameters = " )
print(" ",f_args_optimized)
print(" cost = ", cost)
ax.plot(t_op,x_op,color=(1-color_scheme[j],0,color_scheme[j]),label=this_label)
ax.legend()
ax.set_title(title_msg)
ax.set_xlabel("t")
ax.set_ylabel("n")

plt.show()

## Differential Equation Parameter Estimations

The following is the preview for the following note: differential equation parameter estimations. You can see the related posts in the following: ## Parameter Estimation for Differential Equations using Gradient Descent

The post is about parameter estimation based on gradient descent specifically for the differential equation $\frac{dn}{dt}=G-k_1 n-k_2 n^2-k_3 n^3$, i.e. we are looking for a good $(G,k_1,k_2,k_3)$ that best fits the experimental data. We will go through the methods of choosing initial parameter guesses and running the optimization. This post is more fully explained in this note: Differential Equation Parameter Estimations. The data used here is a fraction of the sample data I was given. The codes and sample data can also be found here.

Consider this link instead Parameter Estimation for Differential Equations using Scipy Least Square since it utilizes more robust optimization component from scipy.

Place fitting.py, fitting_post_processing.py and param_estimate.py (the codes are at the end of the post) in the project folder. Some required python packages include matplotlib, scipy and kero; do pip install them.

1. In the command line, cd into the directory where the project is.
2. Preliminary. Prepare sample_data.csv containing data points (t,n). First column is for t (horizontal axis), second column is n (vertical axis). Set the following toggles in fitting.py.
 plot_data True do_initial_testing False start_op False

Run python fitting.py. This plots the sample data, shown in figure 1. In this context, the relevant data points are the data after the peak. The orange curve shows the plot of only the extracted relevant data.

3. Initial testing stage. Set the following toggles in fitting.py.
 plot_data False do_initial_testing True start_op False

Set collection_of_p_init. You can place as many guesses as you want. In this example,

collection_of_p_init =[
[0,1e-3,0.5e-3,1.2e-7], # guess 1
[0,1e-2,0,1e-7], # guess 2
[0,0,1.5e-3,1e-7], # guess 3
]

we try using 3 different initial parameters, where each of them is of the format [G,k1,k2,k3], corresponding to the 4 parameters that we want to fit. Then, run the command:  python fitting.py. The results are shown in figure 2. It can be seen that the first and third guesses are closer to the experimental data; we might want to use the parameters near these values for random initiation in the optimization stage (but for demonstration sake, we will not do it here).

4. Optimization stage. Set the following toggles in fitting.py.
 plot_data False do_initial_testing False start_op True

Set the preferred settings. p_init_max and p_init_min are the range of random initial parameters to try from. We can set the number of attempts via the variable no_of_tries, and of course each attempt starts with a random initial parameter based on the range we just specified. Run the command:  python fitting.py. You will see optimization in progress with decreasing mse values.

5. Viewing the optimization result. We want to load all 3 fitting_data files we have created. Thus, in fitting _post_processing.py, set
filename = 'fitting_data'
save_data_series = [1,2,3]

Run the command:  python fitting _post_processing.py. Figure 3 shows how after every iteration, the resulting curve is approaching the experimental data.

Finally, we want to see what is the best parameter p. When viewing the optimization result, the parameters will be printed. We are more or less done!

What else to do?

In this post I have only run a few iterations over a few number of random parameter initializations. To obtain the best results, we should run them over larger iterations and try more random initial parameters.

That’s all, have fun~ Figure 1. A plot of n-t from the sample data is in blue. The orange data is in

Figure 2. The curves show 3 different plots from 3 different guesses of initial parameters. The red curve is the experimental data. The console shows recommended setting for learning_rate variable. Figure 3. As the algorithm is iterated, the curve goes from red to blue. The figures generally show that the curve approaches the experimental data and that MSE trend decreases along the iteration of i.

fitting.py

import numpy as np
import kero.utils.utils as ut
import param_estimate as pe
import matplotlib.pyplot as plt
from scipy.integrate import RK45
import csv
import pickle

es = pe.DE_param_estimator()

x0 = np.array([Yr])
t_set_expt = [np.array([x]) for x in Xr]
n_set_expt = [np.array([x]) for x in Yr]
print("x0 = ",x0)
print("Xr[:5] = ",Xr[:5])
print("Yr[:5] = ",Yr[:5])
print("t_set_expt[:5] = ",t_set_expt[:5])
print("n_set_expt[:5] = ",n_set_expt[:5])

#.#

print("----------------- Prepare Model ---------------------------")
# MODEL used to fit this data
#
# We are estimating the parameters of the following differential equations:
#   dn/dt = F(n, p) = G - k1 * n - k2 * n**2 - k3 * n**3
#   where p is the parameter (G,k1,k2,k3)
def F(y,p):
return p - p * y -p * y**2 - p * y**3
diff_part_G = lambda x: 1
diff_part_k1 = lambda x: -x
diff_part_k2 = lambda x: -x**2
diff_part_k3 = lambda x: -x**3
list_of_derivatives = [diff_part_G, diff_part_k1, diff_part_k2, diff_part_k3]
print("  Model prepared.")
#.#

do_initial_testing = 0
if do_initial_testing:
print("---------------- Prepare Initial guess --------------------")
# Before optimizing, try playing around the p_init values here
# We will choose the range of uniform random values for guess p based on tests conducted here
# Write down choices that seem good and any remarks here
#
#   1. p_init = [0,5e-2,2e-3,1.1e-7] # near the real value. all n less than expt data
#   2. p_init = [0,5e-2,1e-3,1.1e-7] # seems to be better than 1
#   3. p_init = [0,5e-2,0,1.1e-7] # all n above expt data
#   4. p_init = [0,1e-1,1e-4,1.1e-7] # small n above expt data, larger n below
#   5. p_init = [0,1e-2,1e-3,1.1e-7] # exceedingly close! Let's vary around these values instead
#
#
#

collection_of_p_init =[
[0,1e-3,0.5e-3,1.2e-7], # guess 1
[0,1e-2,0,1e-7], # guess 2
[0,0,1.5e-3,1e-7], # guess 3
]

fig = plt.figure()
no_of_test = len(collection_of_p_init)
for i in range(no_of_test):
p_init = collection_of_p_init[i] # guess p
time_checkpoints = t_set_expt
def f(t,y):
return np.array([F(y,p_init)])
_,_, x_checkpoint, t_checkpoint_new = pe.RK45_wrapper(x0, time_checkpoints, f, stepsize=None)
ax.plot(t_checkpoint_new,x_checkpoint, label="guess "+str(i+1), color=(0,1-1*(i+1)/no_of_test,1*(i+1)/no_of_test))

# To find what is the best learning_rate
print(" testing initial guess parameters: ",i+1)
es.update_p(t_set_expt, n_set_expt, F, p_init, list_of_derivatives, verbose=11)
ax.plot(Xr,Yr, color = "r", label="expt values")
ax.legend( )
ax.set_xlabel("t")
ax.set_ylabel("n")

plt.show()
#.#

# ------------------------------------------------------------------------------- #
start_op = 1
if start_op:
print("----------------- Start Optimizing ---------------------------")

# ********* Settings *************
p_init_max = [1e-10,0 ,0,0.9e-7] # [Gmax,k1max,k2max,k3max]
p_init_min = [1e-10, 2e-2,2e-3,1.2e-7] # [Gmin ,k1min,k2min,k3min]
no_of_tries = 3
save_name = 'fitting_data'
es.learning_rate= 1e-16
no_of_iterations = 10
save_interval = 1
# ********************************

n_set_expt_MAT = pe.RK45_output_to_list(n_set_expt) # MATRIX list form

for j in range(no_of_tries):
p_init = []
for a,b in zip(p_init_min, p_init_max):
p_init.append(np.random.uniform(a,b))
p_now= p_init
save_count = 0
SAVE_DATA = []
print("TRY ", j+1," / ",no_of_tries)
mse_list = []
for i in range(no_of_iterations + 1):
if i>0:
# for i =0 , initial state, do not iterate yet
p_now = es.update_p(t_set_expt, n_set_expt, F, p_now, list_of_derivatives, verbose=0)

#------------------------ FOR REALTIME OBSERVATION -----------
def f(t,y):
return np.array([F(y,p_now)])
time_checkpoints = t_set_expt
_,_, n_set_next, t_set_next = pe.RK45_wrapper(x0, time_checkpoints, f, stepsize=None)
n_set_next_MAT = pe.RK45_output_to_list(n_set_next)
mse = pe.MSE(n_set_next_MAT,n_set_expt_MAT)
mse_list.append(mse)
print(" i = ", i , " -- > mse = ", mse)
# print("   p_new = ", p_now)

#------------------------ TO BE SAVED -------------------------
save_count = save_count + 1
if save_count == save_interval or i==0:
save_count = 0
data_store = {
"t_set":t_set_next,
"n_set":n_set_next,
"p":p_now,
"learning_rate":es.learning_rate,
"mse_list": mse_list
}
SAVE_DATA.append(data_store)
output = open(save_name+"_"+ str(j+1) +".par", 'wb')
pickle.dump(SAVE_DATA, output)
output.close()

print("\nClosing Program...")

fitting_post_processing.py

import param_estimate as pe
import matplotlib.pyplot as plt
import numpy as np
import time
import pickle

filename = 'fitting_data'
save_data_series = [1,2,3]

for j in save_data_series:
pkl_file = open(filename+'_'+str(j)+'.par', 'rb')
SAVE_DATA = None
pkl_file.close()

fig = plt.figure(j)
ax = None
ax2 = None

# ------------------- EXPT DATA ---------------------------------=
ax.scatter(Xr,Yr,3,marker="x",c="g")

# ------------------- DATA FROM OPTIMIZATION ---------------------
color_scheme = np.linspace(0,1,len(SAVE_DATA))
for i in range(len(SAVE_DATA)):
data_store = SAVE_DATA[i]
n_set = data_store["n_set"]
t_set = data_store["t_set"]
p = data_store["p"]
mse_list = data_store["mse_list"]
print(" i = ", i)
print("   p = ", p )
if i==0:
this_label = "init curve"
else:
this_label = str(i)
ax.plot(t_set,n_set, color=(1-color_scheme[i],0,color_scheme[i]), label=this_label)
if i ==0:
this_title = "initial G,k1,k2,k3 = " + str(p)

plt.title(this_title)
ax2.plot(range(len(mse_list)),mse_list)
ax.legend()
ax.set_xlabel("t")
ax.set_ylabel("n")
ax2.set_xlabel("i")
ax2.set_ylabel("MSE")
plt.show()

print("\nClosing Post-Processing Program...")

param_estimate.py

# We are estimating the parameters of the following differential equations:
#   dn/dt = F(n, p) = G - k1 * n - k2 * n**2 - k3 * n**3
#   where p is the parameter (G,k1,k2,k3)

import numpy as np
import kero.utils.utils as ut
from scipy.integrate import RK45
import matplotlib.pyplot as plt
import csv

class DE_param_estimator:
def __init__(self):
self.learning_rate = 1e-21
return
# l2
def update_p(self,t_set_expt, n_set_expt, F, p_now, list_of_derivatives,
verbose = False):
# t_set_expt: list of numpy array
#   for example: [array([25.2047]), array([25.279]), array([25.3533]), array([25.4277]),...]
# n_set_expt: list of numpy array
#   for example: [array([789.696]), array([784.109]), array([763.528]), array([726.782]),...]
# p_now: list of parameters
#
#
# return
#   p_next: list of parameters
nabla_mse = self.nabla_MSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives)
if verbose>10:
# temp = self.learning_rate*nabla_mse
temp2 = np.transpose(np.matrix(p_now))
for_recommend = []
if verbose>30:
print("    nabla_mse = ")
ut.print_numpy_matrix(nabla_mse,formatting="%s",no_of_space=10)
print("    p_now =")
ut.print_numpy_matrix(temp2,formatting="%s",no_of_space=10)
det_list = []
with np.errstate(divide='ignore'):
for k in range(len(p_now)):
det = 1e-2*p_now[k]/nabla_mse.item(k)
if verbose>20:
print("    p_now[k]/(nabla_mse[k]*100) =",det)
det2 = np.log10(np.abs(det))
if (det2!=np.inf) and (det2!=-np.inf) and (not np.isnan(det2)):
det_list.append(det2)
if verbose>20: print("      det_list = ",det_list)
print("       recommended learning rate ~ ", 10**np.min(det_list))
p_next= np.transpose(np.matrix(p_now)) - self.learning_rate * nabla_mse
p_next = np.transpose(p_next).tolist()
p_next = p_next
return p_next
# L3
def dkdMSE(self, t_set_expt, n_set_expt, F, p_now, diff_part):
# lambda function : d/dk d/dt of n
# t_set_expt, n_set_expt : experimental data
N = len(t_set_expt)
dkmse = 0

def f(t,y):
return np.array([F(y,p_now)])

x0 = n_set_expt # array
time_checkpoints= t_set_expt
x_set, t_set, x_checkpoint, t_checkpoint_new = RK45_wrapper(x0, time_checkpoints, f, stepsize=None)

for i in range(1,len(n_set_expt)):
ni = x_checkpoint[i]
# print(" ni =", ni)
m = i+1
diff_term = 0
for j in range(m):
diff_term = diff_term + diff_part(x_checkpoint[j])
diff_term = diff_term / m
dkmse = dkmse + ( ni - n_set_expt[i])* diff_term
dkmse = 1/N*dkmse
return dkmse

def nabla_MSE(self, t_set_expt, n_set_expt, F, p_now, list_of_derivatives,
verbose = False):
# list_of_derivatives: list of lambda functions

dGdMSE = self.dkdMSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives)
dk1mse = self.dkdMSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives)
dk2mse = self.dkdMSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives)
dk3mse = self.dkdMSE(t_set_expt, n_set_expt, F, p_now, list_of_derivatives)
# print("dGdMSE=", dGdMSE)
# print("dk1mse=",dk1mse)
# print("dk2mse=",dk2mse)
# print("dk3mse=",dk3mse)
nabla_mse = np.transpose(np.matrix([dGdMSE, dk1mse, dk2mse, dk3mse]))
return nabla_mse

filename = 'sample_data'
all_data = [[np.float(x) for x in y] for y in all_data]

X=[x for x in all_data]
Y=[x for x in all_data]

max_index = np.argmax(Y)
relevant_data= all_data[max_index:]
Xr = [x for x in relevant_data]
Yr = [x for x in relevant_data]

print("     data size = ", len(Xr))
print("     peak = ", X[max_index],",", Y[max_index])
if plot_data:
fig = plt.figure()
ax.scatter(X,Y,3)
ax.scatter(X[max_index],Y[max_index])
ax.plot(Xr,Yr,'r')
ax.set_xlabel("t")
ax.set_ylabel("n")
plt.show()

return Xr, Yr

# filename : string, name of csv file without extension
# headerskip : nonnegative integer, skip the first few rows
#   if all rows are to be read, set to zero

# example
# xx = [[np.float(x) for x in y] for y in xx] # nice here
# print(xx)
# print(np.sum(xx))

out = []
with open(str(filename)+'.csv') as csv_file:
count = 0
i = 0
count = count + 1
else:
out.append(row)
if get_the_first_N_rows>0:
i = i + 1
if i == get_the_first_N_rows:
break
return out

def MSE(Y_set,Y0_set):
# Y_set is a list of Y
# Y0_set is a list of Y0
# Assume Y_set and Y0_set have the same size, otherwise error will be raised
#
# Y (numpy matrix) is a vector: computed data from model
# Y0 (numpy matrix) is a vector: observed/true data
# Assume Y and Y0 have the same size, otherwise error will be raised
# Assume all entries are real

mse = 0
n = len(Y_set)
if not (n==len(Y0_set)):
print("MSE(). Error. Inconsistent dimension between Y_set and Y_set0. Return nothing.")
return
for i in range(n):
Y = Y_set[i]
Y0 = Y0_set[i]
Nout1 = len(Y.tolist())
Nout2 = len(Y0.tolist())
if not (Nout1==Nout2):
print("MSE(). Error. Inconsistent dimension between Y and Y0. Return nothing.")
return
# now compute the norms between true and model output values
for j in range(Nout1):
mse = mse + (np.abs(Y0.item(j)-Y.item(j)))**2
mse = mse / (2*n)

return mse

def RK45_wrapper(x0, time_checkpoints, f, stepsize=None):
# example
#
#
#
# def f(t,y):
# 	# dx/dt = f(x,t)
# 	return np.array([-1*y])
# x0 = np.array([1.0]) # initial value
# time_checkpoints = np.linspace(0,10,11)
# x_set, t_set, x_checkpoint, t_checkpoint_new = RK45_wrapper(x0, time_checkpoints, f, stepsize=None)

# t_checkpt_diff = []
# for t,t_new in zip(time_checkpoints, t_checkpoint_new):
# 	t_checkpt_diff.append(abs(t-t_new))
# print("max checkpt diff = ", np.max(t_checkpt_diff))

if stepsize is None:
stepsize = 0.01*(time_checkpoints[-1]-time_checkpoints)
t_set = [time_checkpoints]
x_set = [x0]
t_checkpoint_new = [time_checkpoints]
x_checkpoint = [x0]

for i in range(len(time_checkpoints)-1):
t1 = time_checkpoints[i+1]
if i>0:
x0 = ox.y
t0 = ox.t

else:
t0 = time_checkpoints[i]
ox = RK45(f,t0,x0,t1,max_step=stepsize)

while ox.t < t1:
msg = ox.step()
t_set.append(ox.t)
x_set.append(ox.y)
x_checkpoint.append(ox.y)
t_checkpoint_new.append(ox.t)
return x_set, t_set, x_checkpoint, t_checkpoint_new

def RK45_output_to_list(xx):
return [np.matrix((x.tolist())) for x in xx]