Artificial Intelligence and Applications in Physics - 2: Solving Differential Equations with Neural Networks - A New Paradigm?
Introduction
In an earlier article, I discussed how laws of physics are being derived using AI techniques. There, the primary question was whether AI is capable of discovering the physical laws themselves. But consider a hybrid situation: there is some understanding of the physical law governing a system, yet one must also accommodate data-driven insights, without losing any information the data could provide, and still solve for the evolution of the state space. Is it possible? To answer this, we must first consider the three general types of relations in physics.
Physical Laws
These are well corroborated by numerous experiments and offer full-blown mathematical machinery to develop the analysis further. Examples include laws based on the conservation of charge and energy. In the context of epidemiological modeling, the conservation principle is simply the person count at any given time, accounting for births and deaths. Physical laws are applicable to ALL systems where they are relevant.
Constitutive relations
They approximate the response of a system to external stimuli. These are macroscopic mathematical relations with wide applicability. They are invoked when there is no "fundamental" way of understanding the observed behavior, and they rely instead on a large number of correlations observed across multiple variables. Examples include the stress-strain relation of a material (Hooke's Law), friction factor charts in fluid flow, and incidence-rate models in epidemiology. Such a relation usually holds across a wide range of cases, with only the associated numerical constants varying from case to case. It is always constrained by a domain of applicability and by bounds of variation. For example, Hooke's Law applies only to materials within their elastic zone. Constitutive relations are sometimes arrived at using dimensional analysis (the Buckingham Pi theorem).
Empirical Relations
These relations do not have any physical basis. They are simply curve-fits but are handy in engineering predictions.
Can predict - Cannot Explain
Unfortunately, both constitutive and empirical relations are seriously limited in applicability. Unlike rigorous models based on physical laws, which come from first principles, these two are phenomenological models: they quantify the relation between the variables involved but do not explain why the variables relate to each other as observed. The empirical relations are the worse of the two. We may fit a curve to the observed relation, but the curve fit itself presupposes some underlying functional form, which limits how much of the true dependence between the variables it can capture (assuming here that the observations are the truth, free of any measurement error).
Sounds like a familiar situation in AI? In AI parlance we often encounter exactly this, "Can predict - Cannot explain", especially with Neural Networks.
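To make the point about presupposed forms concrete, here is a minimal sketch. It assumes, purely for illustration, that the "true" relation is y = x^2 and that we insist on fitting a straight line. The function name fit_line and the data are hypothetical and not taken from any of the papers discussed here.

```python
# Ordinary least-squares fit of y = intercept + slope * x.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Noise-free observations of an assumed quadratic "truth" on [0, 1].
xs = [i / 10 for i in range(11)]
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)

# Residuals of the best possible straight line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
max_abs_residual = max(abs(r) for r in residuals)
```

Even with perfect data, the residuals are positive at both ends of the interval and negative in the middle. The error comes from the assumed linear form, not from the measurements.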
I recently came across an interesting paper by Chris Rackauckas of MIT titled "Universal Differential Equations for Scientific Machine Learning". Chris Rackauckas asks a simple question:
Given the framework provided by differential equations (deterministic or stochastic) based on conservation principles, is it possible to replace the constitutive and/or empirical relations with a Neural Network and then solve the equations?
Chris Rackauckas has answered this question thoroughly, with both math and software. If one has to use an empirical relationship to model the behavior between certain variables, why not use the universal approximation capability of a Neural Network and solve the differential equations with it embedded?
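The idea can be sketched in a few lines. The example below is an illustration under stated assumptions, not the paper's software: it takes a scalar state x with a known linear decay term -k*x (standing in for the physical law) and adds a small tanh network in place of the unknown empirical term. The names mlp, rhs and solve_euler, the value of k, the network size and the explicit Euler scheme are all hypothetical choices, and training of the weights is omitted.

```python
import math

def mlp(params, x):
    """One-hidden-layer tanh network mapping a scalar to a scalar."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2

def rhs(params, x, k=0.5):
    """Known physics (-k * x) plus a learned term supplied by the network."""
    return -k * x + mlp(params, x)

def solve_euler(params, x0, dt, steps):
    """Explicit Euler integration of dx/dt = rhs(params, x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * rhs(params, xs[-1]))
    return xs

# With zero output weights the network contributes nothing, so the
# solution is plain exponential decay (up to Euler discretisation error).
zero_params = ([1.0, -1.0], [0.0, 0.0], [0.0, 0.0], 0.0)
physics_only = solve_euler(zero_params, x0=1.0, dt=0.01, steps=200)

# Non-zero output weights modify the dynamics. In practice these weights
# would be fitted to data rather than set by hand as they are here.
params = ([1.0, -1.0], [0.0, 0.0], [0.2, -0.1], 0.0)
hybrid = solve_euler(params, x0=1.0, dt=0.01, steps=200)
```

The structure of the equation stays fixed by the physics. Only the term that would otherwise be a curve fit is handed to the network.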
So what is the real attraction of chasing a neural network rather than a simple polynomial curve fit, which has a definite form and is easy to interpret?
The real power of the neural network is its ability to act as a universal approximator for any continuous function. This does not mean we can compute the function exactly, but it offers hope that there is a method, backed by theory, that can capture the complex dependence of variables in a multi-dimensional space without any presupposition about the form of that dependence. In fact, a 2017 paper showed that ReLU networks of width n+1, with depth allowed to grow, are sufficient to approximate any continuous function of n input variables. In the classical single-hidden-layer setting, the approximation improves as the number of hidden neurons increases. Of course, the Neural Network would smile back and say, "Good, you gave up your need for a form, and I am formless."
Of course, like any good research paper, the work from Chris builds on the excellent work of predecessors, including Lagaris et al. and Chen et al. (check out this very nice video by Prof. Duvenaud), with appropriate acknowledgements.
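As a small illustration of approximation improving with more hidden neurons, the sketch below builds a one-hidden-layer ReLU network by hand so that it interpolates a target function at equally spaced knots. This is the classical wide-and-shallow construction, not the width n+1 deep construction from the 2017 paper. The helper names and the choice of sin on [0, pi] as the target are assumptions made for the example.

```python
import math

def relu(z):
    return z if z > 0.0 else 0.0

def build_relu_net(f, a, b, n_hidden):
    """One-hidden-layer ReLU net that matches f at n_hidden + 1 knots on [a, b]."""
    h = (b - a) / n_hidden
    knots = [a + i * h for i in range(n_hidden + 1)]
    slopes = [(f(knots[i + 1]) - f(knots[i])) / h for i in range(n_hidden)]
    # Each hidden unit relu(x - knot) switches on at its knot; its output
    # weight is the change in slope needed there.
    out_weights = [slopes[0]] + [slopes[i] - slopes[i - 1]
                                 for i in range(1, n_hidden)]
    out_bias = f(a)

    def net(x):
        return out_bias + sum(w * relu(x - k)
                              for w, k in zip(out_weights, knots))
    return net

def max_error(net, f, a, b, samples=2000):
    """Largest absolute error on a uniform grid of sample points."""
    points = [a + (b - a) * j / samples for j in range(samples + 1)]
    return max(abs(net(x) - f(x)) for x in points)

# Worst-case error for 4, 16 and 64 hidden neurons.
errors = {n: max_error(build_relu_net(math.sin, 0.0, math.pi, n),
                       math.sin, 0.0, math.pi)
          for n in (4, 16, 64)}
```

The worst-case error shrinks as hidden neurons are added, and no functional form for the target was assumed beyond continuity.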
Applications of the combined methodology
This hybrid methodology has many interesting applications in different domains.
Epidemiology
The power of data analytics is currently needed most in the fight against COVID (check out the SAAL-powered Department of Health website on COVID for the latest numbers and trends).
Epidemiological modelling (the subject of the next article) has greatly assisted in predicting trends, thereby aiding capacity management and helping to contain the spread. Models range from deterministic mathematical models through to complex, spatially explicit stochastic simulations (e.g. the GLEAM Project) and decision support systems for hospitals. Whatever their form, the objective is usually to predict:
- how many will be infected
- how many will recover
- when the spread will stop
In this context, another team from MIT has published a paper that uses this approach to predict COVID trends for the USA. The paper by Dandekar et al., titled "Quantifying the effect of quarantine control in Covid-19 infectious spread using machine learning", uses the above methodology to define a new quarantined state in the compartmental models already available in epidemiology. In a nutshell, they use an S-I-R-Q model in which the quarantined population is governed by a quarantine strength function, and that function is represented by a neural network. The neural network-augmented SIR ODE system is trained by minimizing a mean squared error loss over the model parameters, which include the neural network's weights W.
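A rough sketch of such a network-augmented compartmental model is given below. It is an illustration in the spirit of the description above, not the authors' implementation: the network architecture, its input (the infected fraction), the softplus output, the parameter values and the explicit Euler time stepping are all assumptions, and the names quarantine_strength, simulate and mse_loss are hypothetical.

```python
import math

def quarantine_strength(weights, infected_fraction):
    """Hypothetical one-neuron network; softplus keeps the strength >= 0."""
    w1, b1, w2, b2 = weights
    hidden = math.tanh(w1 * infected_fraction + b1)
    return math.log1p(math.exp(w2 * hidden + b2))

def simulate(weights, beta, gamma, s0, i0, days, dt=0.1):
    """Euler integration of an SIR model with an extra quarantined state."""
    n = s0 + i0
    s, i, r, q = s0, i0, 0.0, 0.0
    history = [(s, i, r, q)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            strength = quarantine_strength(weights, i / n)
            new_infections = beta * s * i / n
            ds = -new_infections
            di = new_infections - (gamma + strength) * i
            dr = gamma * i
            dq = strength * i  # infected people moved into quarantine
            s, i, r, q = s + dt * ds, i + dt * di, r + dt * dr, q + dt * dq
        history.append((s, i, r, q))
    return history

def mse_loss(weights, beta, gamma, s0, i0, observed_infected):
    """Mean squared error between simulated and observed infected counts."""
    history = simulate(weights, beta, gamma, s0, i0,
                       days=len(observed_infected) - 1)
    squared = [(state[1] - obs) ** 2
               for state, obs in zip(history, observed_infected)]
    return sum(squared) / len(squared)

# Synthetic "observations" generated from known weights (made-up numbers).
true_weights = (2.0, 0.0, 1.0, -3.0)
history = simulate(true_weights, 0.3, 0.1, 990.0, 10.0, days=30)
data = [state[1] for state in history]

loss_true = mse_loss(true_weights, 0.3, 0.1, 990.0, 10.0, data)
loss_other = mse_loss((2.0, 0.0, 1.0, -1.0), 0.3, 0.1, 990.0, 10.0, data)
```

The four rates sum to zero, so the total person count is conserved at every step, which is the physical-law part of the model. A gradient-based optimizer, omitted here, would adjust the weights (and possibly beta and gamma) to drive the loss down.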
Battery Research
In most battery models, the state space is modeled using a set of partial differential equations in space and time. Be it lithium batteries or ultrabatteries, when the battery is on load, the equilibrium cell voltage of the system (informally, and incorrectly, called the Open Circuit Voltage, or OCV) is the voltage the system would exhibit at that state of activity of the reactants if the current through the battery were zero. It depends on a number of factors, including the State of Charge, chemical potentials, age and the charge-discharge protocols of the battery. In lithium batteries it is a complicated function of the state of intercalation / deintercalation of the lithium ion. It is usually curve-fitted and then used along with the charge, mass, species, energy and momentum conservation equations to model the state of the battery at any time. For example, the 2014 paper by Weng et al., titled "A Unified Open-Circuit-Voltage Model of Lithium-ion Batteries for State-of-Charge Estimation and State-of-Health Monitoring", proposed an empirical model for OCV.
Clearly the authors had to rack their brains to come up with the proposed model, which has at least 6 coefficients. Such empirical OCV relations must be coupled with balance equations (see, for example, "Governing equations for a two-scale analysis of Li-ion battery cells" by Salvadori et al. and many papers by Prof. Venkat Subramanian), and a system of Differential-Algebraic Equations is then solved to predict the state-space evolution of the system. Using a neural net in place of the curve fit, alongside the PDEs, could greatly benefit this modelling methodology.
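The coupling between a fitted OCV relation and a balance equation can be sketched with a deliberately simple model. The example below assumes a lumped equivalent circuit (a charge balance for State of Charge plus an algebraic terminal-voltage relation), which is far simpler than the PDE-based models cited above. The function names, the linear placeholder OCV curve and all numerical values are hypothetical and are not taken from Weng et al.

```python
def simulate_discharge(ocv, capacity_ah, current_a, resistance_ohm,
                       soc0, hours, dt_h=0.01):
    """Charge balance for SOC plus an algebraic terminal-voltage relation.

    `ocv` is any callable mapping SOC in [0, 1] to volts, so a curve fit
    and a trained neural network are interchangeable here.
    """
    soc = soc0
    trace = []
    for _ in range(round(hours / dt_h)):
        # Algebraic part: OCV minus the ohmic drop under load.
        voltage = ocv(soc) - current_a * resistance_ohm
        trace.append((soc, voltage))
        # Differential part: charge conservation (coulomb counting).
        soc = max(0.0, soc - current_a * dt_h / capacity_ah)
    return trace

def linear_ocv(soc):
    """Placeholder OCV curve with made-up coefficients."""
    return 3.0 + 1.2 * soc

# One hour of discharge at 1 A from a full 2 Ah cell.
trace = simulate_discharge(linear_ocv, capacity_ah=2.0, current_a=1.0,
                           resistance_ohm=0.05, soc0=1.0, hours=1.0)
```

Because the solver only ever calls ocv(soc), swapping linear_ocv for a network trained on measured data would leave the balance equation untouched.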
Conclusion
In summary, using Neural Nets along with ODEs combines a pre-structured understanding of the system with good predictive power. As a hybrid technique, this could hugely benefit the scientific community.
Reader comment (1y): Murali, your article helped clarify my thinking on this area, which is so important. There is a progression in understanding "WHY" in the effort taken to correctly develop empirical relationships, through constitutive relationships, toward physical laws, and it builds to new fundamental or absolute understanding. If neural-net AI is used to replace the effort behind empirical and constitutive relationships, the "burning drive" to ask WHY a relationship has its derived characteristics may be lost, and the development of fundamental science and understanding may be slowed. Of course, the neural-net AI could be trained to care about the WHY as much as about being able to predict a result... but would it share the WHY? And if AI is developed to put WHY before WHAT, might the energy that human scientists draw from discovery come to feel meaningless? It is clear that in this case the WHY and WHETHER of this transfer of the greatest drive for human advancement, the search for "WHY" itself, to AI will become moot... leading to WHAT... humanity superseded and demoralised in its search for WHY by superintelligence... Perhaps all that is left is to partner with AI to immerse in creation... find the WHY in the joy of dappled things... WHEN
Reader comment (4y): Pardon me, as I am not using the correct terminology. Yes, I mean establishing universal relationships through AI. We are doing one project with Schlumberger. They claim to apply AI to PVT data to generate a more meaningful EOS. A generic, universal EOS is the dream of reservoir engineers. It can reduce computing time.
Reader comment (4y): Very good treatment of the subject. Can we then find universal relationships for phase separation and for physical properties like viscosity in two-phase and three-phase systems?