WEBVTT

00:00:00.000 --> 00:00:14.440
First, I would like to thank Enrique for this invitation. I was here two years ago. I think

00:00:14.440 --> 00:00:20.240
it's the exact same room. It's a great pleasure to be back here. Looks like today we have

00:00:20.240 --> 00:00:30.680
an SJTU invasion of Erlangen. Today I will share with you some work I did in the last few years

00:00:30.680 --> 00:00:39.640
in the direction of interacting particle systems, for both classical and quantum systems.

00:00:39.640 --> 00:00:47.800
So this is what I will try to cover. First, I'll talk about the random batch methods for classical

00:00:47.880 --> 00:00:53.560
interacting particle systems. This is joint work with Lei Li, my young colleague, and Jian-Guo Liu

00:00:53.560 --> 00:01:01.160
at Duke University. Then one particular application which I think is important is

00:01:01.160 --> 00:01:06.720
in molecular dynamics. Then I will move on to talk about quantum many-body problems and quantum

00:01:06.720 --> 00:01:15.240
molecular dynamics. So basically the problem I try to solve is a very fundamental equation that everybody

00:01:15.400 --> 00:01:22.400
knows: F equals ma, Newton's second law. It takes the form of a system of ODEs. So here

00:01:22.400 --> 00:01:30.320
we have a bunch of N particles with positions x_i. For the i-th particle, x_i is the position and v_i

00:01:30.320 --> 00:01:36.760
is the velocity. The dots here are just time derivatives. So the first line is the definition,

00:01:36.760 --> 00:01:43.320
right? dx/dt equals velocity. And the second line is F equals ma. So here we assume all the particles are

00:01:43.320 --> 00:01:50.320
exactly the same, so the mass is one. So the left-hand side is clearly the acceleration: the

00:01:50.320 --> 00:01:56.200
time derivative of the velocity is a, the acceleration. The right-hand side is just F. So here

00:01:56.200 --> 00:02:05.280
F comes from the interaction of particles. The particles exert forces upon each other. What's

00:02:05.280 --> 00:02:12.240
important is the summation term. So each particle interacts with all the other particles. So here

00:02:12.240 --> 00:02:19.040
essentially we have N minus 1 terms in the summation. So this is usually called a second-order interacting

00:02:19.040 --> 00:02:24.880
particle system, where phi is the potential. The negative gradient

00:02:24.880 --> 00:02:31.960
of the potential is just the force, so it's the force between the particles. Sometimes you take the so-called

00:02:31.960 --> 00:02:37.120
overdamped limit, and you get a first-order system. You eliminate the velocity, for example, in certain

00:02:37.120 --> 00:02:43.520
asymptotic regimes. Then you have this sort of Langevin equation for the interacting particles,

00:02:43.520 --> 00:02:48.440
a Langevin equation. You still have the interacting particles and the interaction force K. K is

00:02:48.440 --> 00:02:54.840
the interaction force. We normalize it, dividing by N minus 1. And here V is the external

00:02:54.840 --> 00:03:01.520
potential: you have some external force coming from an external potential. You may also assume a

00:03:01.520 --> 00:03:10.640
noisy background. So these are the systems we try to solve. Talking about applications,

00:03:10.640 --> 00:03:16.800
it's everywhere. If you have a long enough career, there's a big probability you'll solve this

00:03:16.800 --> 00:03:22.600
equation. So here I just listed a few. Well, in physics and chemistry, molecular dynamics,

00:03:22.600 --> 00:03:29.560
for example. In astrophysics, where you have gravitational force between stars, galaxies.

00:03:30.120 --> 00:03:36.280
In biology, in the last two decades, people use so-called agent-based models. So you model

00:03:36.280 --> 00:03:47.120
interactions between different biological agents. And in social science and economics, people trade,

00:03:47.120 --> 00:03:56.120
for example. That's a kind of interaction between different economic agents. Also, the particle

00:03:56.120 --> 00:04:03.600
method is also a numerical method to solve, usually, high-dimensional PDEs. These are called

00:04:03.600 --> 00:04:12.080
Lagrangian methods, versus Eulerian methods which solve the PDEs directly. So here you solve PDEs by a particle method.

00:04:12.080 --> 00:04:17.780
Usually these are, for example, kinetic equations and mean-field equations: very high-dimensional PDEs.

00:04:17.780 --> 00:04:25.040
So the difficulty here, actually, is the size of the problem. So here, notice that we have a

00:04:25.560 --> 00:04:33.440
big summation. And the number of particles usually is astronomical. For example, in cosmology,

00:04:33.440 --> 00:04:41.520
astrophysics, N can be as high as 10 to the 20. In plasmas, also 10 to the 20. With numerical

00:04:41.520 --> 00:04:46.560
particle methods, the largest computations you can do are 10 to the 9 to 10 to the 12 particles,

00:04:46.560 --> 00:04:53.080
and there are many other applications. So the difficulty here is that we have to add a lot of terms. So

00:04:53.160 --> 00:04:58.240
it's really the size of the problem. It's not that we don't know how to solve all these. Everybody

00:04:58.240 --> 00:05:04.240
here knows how to solve these. But the problem is that this summation, you see, is an O(N)

00:05:04.240 --> 00:05:11.320
summation for each i. But we have N of them, so the complexity is O(N) times O(N), that is, N

00:05:11.320 --> 00:05:18.040
squared. That's the difficulty here. And O(N²) is usually forbidding for classical

00:05:18.040 --> 00:05:23.320
computation. And there are very famous algorithms developed to address this issue, for example,

00:05:23.320 --> 00:05:30.920
the fast Fourier transform. And also, particularly for the problem I'm going to talk about, the fast multipole

00:05:30.920 --> 00:05:39.680
method introduced by Rokhlin and Greengard. Those are two of the so-called top 10 algorithms of

00:05:39.680 --> 00:05:47.800
the last century. They are about how to reduce O(N²) to O(N) or O(N log N). So today I will

00:05:47.800 --> 00:05:54.240
show you a method we introduced. It's called the random batch method. So the goal is to reduce

00:05:54.240 --> 00:06:02.240
from O(N²) to O(N). And actually it's a very simple method and also very

00:06:02.240 --> 00:06:09.920
general. It works for all forces, all different potentials. This is joint work

00:06:09.920 --> 00:06:16.160
with Lei Li and Jian-Guo Liu. It's actually embarrassingly simple. Only two steps. I hope

00:06:16.160 --> 00:06:20.760
everybody can understand it. You can go back and program it. I'm sure you can finish the program in

00:06:20.760 --> 00:06:27.680
half an hour. Last time I was in Changchun, at the normal university, there were two young faculty members who

00:06:27.840 --> 00:06:34.200
saw this advertisement in my talk. And when I said a certain number of minutes, they said,

00:06:34.200 --> 00:06:42.120
that's an overstatement: it took them 15 minutes to program it. Anyway, it's very simple. Two steps.

00:06:42.120 --> 00:06:47.760
So for the first step, we do a so-called random shuffling. You have in total capital N particles.

00:06:47.760 --> 00:06:57.360
You divide them into small groups, but randomly. You divide the total N particles into little n

00:06:57.360 --> 00:07:04.680
groups. Each group has little p particles, so p times little n equals capital N. The key here is that

00:07:04.680 --> 00:07:11.360
p, the size of each group, has to be very small compared to capital N. It can be as small as two.

00:07:11.360 --> 00:07:16.320
It means basically two particles in each group. You just do a random shuffling. Like if you play a

00:07:16.320 --> 00:07:22.120
deck of cards, or mahjong in China: after each game you do a random shuffle and reorder.

00:07:22.880 --> 00:07:28.920
You put the particles into different groups. Then at each time step, each particle only

00:07:28.920 --> 00:07:36.640
interacts with the particles in the same small group. You forget the other particles.

00:07:36.640 --> 00:07:42.240
You just interact within your own group. That's it. That's it.

00:07:42.240 --> 00:07:52.080
Well, if you write the algorithm yourself, and you're probably smarter than the people I talked about, it takes

00:07:52.080 --> 00:07:59.560
maybe 10 minutes. Just a few lines. You pick a time step tau, which is small. Assume that you have

00:07:59.560 --> 00:08:06.240
done m steps. At the next step, what do you do? You partition the capital N particles randomly. It has

00:08:06.240 --> 00:08:12.120
to be a uniform random partition into little n groups. And then in the dynamical system,

00:08:12.120 --> 00:08:18.960
in the summation term, instead of adding all capital-N-minus-one terms, you add p minus one

00:08:18.960 --> 00:08:25.840
terms. So for each i, you only pick up those j in the same group as i. Then you take the average.
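As a concrete illustration, one time step of the method might be sketched like this in Python. This is my own minimal sketch for a first-order system in one dimension; `grad_V` and `K` are placeholder callables, and the batch size p is assumed to divide N.

```python
import random

def rbm_step(x, tau, p, grad_V, K):
    """One random batch step for dx_i/dt = -grad_V(x_i) + (1/(N-1)) sum_j K(x_i - x_j).
    Assumes the batch size p divides N = len(x)."""
    n = len(x)
    idx = list(range(n))
    random.shuffle(idx)                 # step 1: random shuffling, O(N)
    new_x = list(x)
    for start in range(0, n, p):        # step 2: interact only within each small group
        batch = idx[start:start + p]
        for i in batch:
            # p - 1 interaction terms instead of N - 1
            force = sum(K(x[i] - x[j]) for j in batch if j != i) / (p - 1)
            new_x[i] = x[i] + tau * (-grad_V(x[i]) + force)
    return new_x
```

For example, with a confining gradient like `grad_V = lambda z: z` and an attractive kernel `K = lambda r: -r`, the particles drift toward each other while each step costs only O(N).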

00:08:25.840 --> 00:08:36.520
Okay, that's it. Apparently the cost is now O(N), because now the summation,

00:08:36.520 --> 00:08:43.320
instead of having capital N minus one terms, has p minus one terms, and p is O(1). N can be astronomical,

00:08:43.320 --> 00:08:51.180
but p is O(1): a few, a few hundred, whatever. So it's essentially an O(1)

00:08:51.180 --> 00:08:57.540
summation here for each particle. So for N particles it's O(N), right? And also this

00:08:57.540 --> 00:09:05.220
random shuffling, of course, costs something too. But already in the 70s,

00:09:05.220 --> 00:09:11.500
there were already O(N) algorithms to do this random shuffling. For example, if you use MATLAB,

00:09:11.500 --> 00:09:16.380
you just use the subroutine called randperm, a random permutation. You just call

00:09:16.380 --> 00:09:21.260
that subroutine. That's why it's only a 10-minute program, okay, even for grad students.
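In Python, the same O(N) shuffle-and-partition (the analogue of MATLAB's randperm, done via the standard Fisher-Yates shuffle) could look like this; the function name is my own illustration:

```python
import random

def random_partition(N, p):
    """Partition indices 0..N-1 into N//p random groups of size p,
    in O(N) time via the standard library's Fisher-Yates shuffle."""
    idx = list(range(N))
    random.shuffle(idx)                           # O(N) shuffle
    return [idx[k:k + p] for k in range(0, N, p)]
```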

00:09:21.260 --> 00:09:32.020
Well, there's another version, okay? Instead of doing the random partition, for each step,

00:09:32.020 --> 00:09:39.060
I just randomly grab p particles and let them interact. Then I put them back. It's like you have

00:09:39.060 --> 00:09:46.820
a basket of balls: you randomly draw p balls, let them interact, and put them back. All right,

00:09:46.820 --> 00:09:51.780
then you randomly draw another p particles, let them interact, and put them back. After little n

00:09:51.780 --> 00:09:59.460
draws, where little n times p is capital N, you finish one time step; then you repeat.
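A sketch of this second, with-replacement variant (again my own illustrative Python, not code from the talk):

```python
import random

def rbm_replacement_draws(N, p):
    """One full time step of the with-replacement variant: N//p substeps,
    each drawing p particles uniformly at random and then 'putting them back'.
    Within one step some particles may repeat across draws and some may be missed."""
    return [random.sample(range(N), p) for _ in range(N // p)]
```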

00:09:59.460 --> 00:10:05.060
Okay, well this is a little bit different from the other version, because now at each time

00:10:05.060 --> 00:10:13.740
step, some particles may be picked more than once and some particles may be missed, okay? But as you

00:10:13.740 --> 00:10:19.860
accumulate in time, it's almost the same, okay, very similar. So both algorithms can be implemented

00:10:19.860 --> 00:10:27.060
very easily. Of course you may wonder: well, you know, I have been very lazy, right? I only

00:10:27.060 --> 00:10:32.380
do part of the job. I didn't finish the work. So at each time step I saved O(N) in effort.

00:10:32.380 --> 00:10:43.020
Does it take N times longer to finish the job? Okay, intuitively, if you only do part of the job,

00:10:43.020 --> 00:10:46.940
you take a longer time, okay, N times longer to finish. If that's the case, then you didn't save

00:10:46.940 --> 00:10:54.980
anything, okay? But here what's important is that I'm going to prove that actually the time step can

00:10:54.980 --> 00:11:01.460
be independent of capital N. That's the key. I save O(N) effort at each time step,

00:11:01.540 --> 00:11:09.660
but it does not take N times longer to finish the job. It sounds too good to be true. I'm telling you a

00:11:09.660 --> 00:11:18.580
way to be lazy and get away with it, okay? Don't do that in your work, your research, but you know,

00:11:18.580 --> 00:11:26.020
this is the way, this is how you can cheat and get away with it. Of course, I mean this idea,

00:11:26.020 --> 00:11:33.100
this crazy idea, was inspired by some other ideas. For example, here you have a strong

00:11:33.100 --> 00:11:38.100
group in machine learning. We know if you do a deep neural network, at the end of the day,

00:11:38.100 --> 00:11:45.220
you have to solve a global minimization problem for a high-dimensional non-convex function,

00:11:45.220 --> 00:11:50.500
all right? So there, of course, the most popular algorithm is the stochastic gradient descent method,

00:11:50.500 --> 00:11:57.860
as everybody knows. So you have a loss function which has many terms. At each gradient descent step,

00:11:57.860 --> 00:12:05.060
you just randomly pick up a mini-batch, okay? And then you do the gradient descent. Of course,

00:12:05.060 --> 00:12:11.500
number one, you save effort. Number two, since it's random, you add some noise. This noise

00:12:11.500 --> 00:12:19.580
will give you some escape probability, to get out of local minima. So it gives you some probability

00:12:19.580 --> 00:12:24.340
of getting to the global minimum, right? So that's a popular method. If you come from the Boltzmann

00:12:24.340 --> 00:12:29.740
community, the most popular method to solve the Boltzmann equation is the so-called direct simulation Monte

00:12:29.740 --> 00:12:35.140
Carlo, where you randomly pick two particles and let them collide, and of course there are some

00:12:35.140 --> 00:12:39.780
conservation laws you have to satisfy, all right? It's called Nanbu's algorithm, or Bird's method,

00:12:39.780 --> 00:12:45.700
whatever. Of course, there it's physically just a two-particle interaction, but in our method it's

00:12:45.700 --> 00:12:50.420
an N-particle interaction, and I just randomly pick up a few. It's a little bit different,

00:12:50.420 --> 00:12:57.140
but you know, it's very similar. Okay, now I'm going to tell you: I have been lazy,

00:12:57.140 --> 00:13:03.340
but I get away with it, okay? I'm going to prove a theorem that it works. So to prove a theorem,

00:13:03.340 --> 00:13:06.940
I have to make some assumptions, all right? Assume some regularities. For example,

00:13:06.940 --> 00:13:14.860
I assume this external potential V is strongly convex, which means that V(x) minus r/2 times x squared is still

00:13:14.860 --> 00:13:20.460
convex. Okay, so I have a constant r here: V is more convex than r over two times x squared,

00:13:20.460 --> 00:13:25.660
with some bounded derivatives. I mean, you can allow the derivatives to grow, but only polynomially.

00:13:25.660 --> 00:13:34.100
And now the interaction force K we assume to be bounded, all right, and Lipschitz with Lipschitz

00:13:34.100 --> 00:13:40.380
constant L, and to have bounded second derivatives. So here I assume this convexity coefficient r is

00:13:40.380 --> 00:13:46.620
bigger than two times this Lipschitz constant L, all right? So actually one can prove, well,

00:13:46.620 --> 00:13:50.340
if I don't have the noise term, okay, just the interacting particles, then you can prove

00:13:50.340 --> 00:13:55.980
something about the L2 distance. If you start with two different initial data, you have two solutions,

00:13:55.980 --> 00:14:05.020
x_i and y_i. The time derivative of the L2 distance is bounded by itself, times the constant minus r

00:14:05.100 --> 00:14:11.860
minus two L. So under this condition, r bigger than two L, this coefficient is negative,

00:14:11.860 --> 00:14:18.580
which means that it's contractive. This dynamics is contractive, all right? So the L2 distance actually

00:14:18.580 --> 00:14:24.660
decays exponentially to zero. Of course, I mean, this is boring. If you add the noise,

00:14:24.660 --> 00:14:30.300
where sigma is the strength of the noise. If you add the noise, actually the invariant measure, or the

00:14:30.780 --> 00:14:35.260
global equilibrium, actually can be a Gibbs measure. It's nontrivial: it's a Gibbs measure, a

00:14:35.260 --> 00:14:42.940
Maxwellian, and so on. Anyway, so those are the conditions. Now assume the exact solution is x_i and the

00:14:42.940 --> 00:14:48.420
random batch solution is x_i tilde. Of course, the error is the difference between the two, right?

00:14:48.420 --> 00:14:57.220
And it's a vector, so I take the L2 norm, and I take the expected value because it's a random method,

00:14:57.620 --> 00:15:04.140
the expected value. You can also measure the error by the so-called Wasserstein-2 distance. So here we have

00:15:04.140 --> 00:15:11.780
stochastic differential equations, all right? So you have a probability distribution for each of

00:15:11.780 --> 00:15:19.540
x_i and x_i tilde, call them mu and nu. And you take this definition of the Wasserstein distance, all right?
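In the simplest concrete case, two empirical measures with the same number of atoms on the real line, the Wasserstein-2 distance has a closed form: the optimal coupling pairs the sorted samples. A small Python illustration (mine, not from the talk):

```python
import math

def wasserstein2_1d(xs, ys):
    """W2 distance between two empirical measures (1/n) sum delta_{x_k} and
    (1/n) sum delta_{y_k} on the real line: pair the sorted samples."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))
```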

00:15:19.540 --> 00:15:26.220
So here capital Pi, I mean, pi, runs over the joint distributions of the two random variables,

00:15:26.220 --> 00:15:32.860
all right? So actually these two errors are basically equivalent. If you can prove one,

00:15:32.860 --> 00:15:40.140
you get the other. So here is a little theorem, okay? Now assume those stated conditions on

00:15:40.140 --> 00:15:47.580
V and K are satisfied; then the error is bounded by essentially the square root of tau, where tau is

00:15:47.580 --> 00:15:55.260
the time step. So it's half order in the time step. Why half order? What we have is actually

00:15:55.260 --> 00:16:02.380
a Monte Carlo method in time. You probably know the Monte Carlo method: if you have

00:16:02.380 --> 00:16:07.260
capital N samples, the error of Monte Carlo sampling is one over square root of N, so it's

00:16:07.260 --> 00:16:13.700
half order in the number of samples. Here we actually have a Monte Carlo method in time. So the error is the square

00:16:13.700 --> 00:16:20.220
root of the time step, okay? But that's not very important. What's important is this constant C.
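The half-order rate is the familiar Monte Carlo one-over-square-root-of-the-sample-size behavior; a quick sanity check in Python (a uniform random variable stands in for the batch fluctuations):

```python
import random

def mc_abs_error(n_samples, rng):
    """|sample mean - true mean| for U ~ Uniform(0,1), whose true mean is 0.5;
    this error scales like O(1/sqrt(n_samples))."""
    est = sum(rng.random() for _ in range(n_samples)) / n_samples
    return abs(est - 0.5)
```

Averaged over repetitions, the error at n = 10000 comes out roughly ten times smaller than at n = 100, matching the half-order rate.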

00:16:20.220 --> 00:16:27.260
You can prove, under the conditions that I stated, that C is independent of capital N. So what does

00:16:27.260 --> 00:16:33.900
that mean? It means, so for example, if I want the error to be like 10 to the minus 3, I just have

00:16:33.900 --> 00:16:41.340
to select the time step to be 10 to the minus 6. My N can even be 10 to the zillion, it doesn't matter,

00:16:41.820 --> 00:16:50.780
so now you see my time step is independent of N, which means indeed I do not need N times longer

00:16:50.780 --> 00:17:00.820
to finish the work. So I cheated and I succeeded. So that's the L2 error, and you get a similar

00:17:00.820 --> 00:17:09.180
error for the Wasserstein distance between the empirical measures of the two systems. Also square

00:17:09.180 --> 00:17:13.340
root of tau: C times the square root of tau. So that's the key: the time step is independent of capital N.
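Written out, the error bound as stated here takes roughly the following form (my LaTeX transcription of the spoken statement, under the assumptions on V and K above):

```latex
% Random batch error, half order in the time step tau, with C independent of N:
\sup_{t \in [0,T]}
\mathbb{E}\Big[\frac{1}{N}\sum_{i=1}^{N}\big|X_i(t)-\widetilde X_i(t)\big|^2\Big]^{1/2}
\le C\,\sqrt{\tau},
\qquad C \ \text{independent of } N .
```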

00:17:13.340 --> 00:17:21.540
You may wonder why this can happen, why this is true. Actually, you compute the error,

00:17:21.540 --> 00:17:27.500
we committed, right? Before, you had this N-minus-1-term summation, the average;

00:17:27.500 --> 00:17:37.300
now I just take p minus 1 terms, okay? So that's the truncation error, right? The truncation error

00:17:37.300 --> 00:17:44.860
actually is O(1). Okay, I'm not doing the full job, so it's O(1), clearly. However,

00:17:44.860 --> 00:17:50.380
if you take the expected value, okay, with chi being the error, then for a

00:17:50.380 --> 00:17:58.660
fixed x, the expected value of chi is zero. So on average, I'm doing the right thing, okay? So

00:17:58.660 --> 00:18:06.500
instead of taking the total average, I take a partial average, a random partial average, and then the expected

00:18:07.300 --> 00:18:12.420
error is zero. This is very useful, you see. I mean, today, every day, there's a lot

00:18:12.420 --> 00:18:19.180
of news about Donald Trump. Before the US election, there are a lot of polls; usually they have a lot

00:18:19.180 --> 00:18:29.220
of polls. They don't poll all the voters because that's too many; usually it's more than 130 million,

00:18:29.220 --> 00:18:35.140
whatever. You don't call everyone; you just poll a small sample, maybe 1,000. You get an average:

00:18:35.940 --> 00:18:42.660
maybe Trump is 52%, Harris is 48%. There's some error usually, okay, a margin of error, but it's small.

00:18:42.660 --> 00:18:50.340
So you see, you just take a random partial sum, a partial average, and they have a pretty good prediction

00:18:50.340 --> 00:18:58.780
of who is going to win. But it must be unbiased: you cannot just poll Texas; it's a red state,

00:18:59.020 --> 00:19:05.100
most people vote for Republicans, Trump. If it's California, it's blue, right? Usually they go for

00:19:05.100 --> 00:19:10.540
the Democrats. You cannot do that. You have to close your eyes, you know, take a uniform, unbiased sample;

00:19:11.260 --> 00:19:17.100
then it works, okay? It works. So it's not only a philosophy of life, I mean,

00:19:17.980 --> 00:19:22.140
math is a philosophy for your life: you can take a partial random average.
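The unbiasedness behind the polling analogy can be checked numerically: for fixed positions, the batch average of the interaction equals the full average in expectation. A small Python check with an arbitrary illustrative kernel and positions:

```python
import random

def full_average(x, i, K):
    """Exact interaction term for particle i: average of K over all j != i."""
    return sum(K(x[i] - x[j]) for j in range(len(x)) if j != i) / (len(x) - 1)

def batch_average(x, i, K, p, rng):
    """Random batch version: average of K over p - 1 randomly chosen partners."""
    others = [j for j in range(len(x)) if j != i]
    partners = rng.sample(others, p - 1)
    return sum(K(x[i] - x[j]) for j in partners) / (p - 1)
```

Averaging `batch_average` over many random draws recovers `full_average`; any single draw is off by an O(1) fluctuation, exactly as described above.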

00:19:22.300 --> 00:19:27.420
Well, later we did some more sophisticated analysis, for example for second-order systems;

00:19:27.900 --> 00:19:32.300
we can even allow disparate species and weights, not all the particles are the same.

00:19:33.740 --> 00:19:39.260
Also, for a long time people were wondering: when you have error, especially Monte Carlo error,

00:19:39.260 --> 00:19:42.460
I mean, the error accumulates in time; eventually, of course, you get garbage.

00:19:43.980 --> 00:19:49.420
But one can actually prove otherwise. This is joint work with Zhennan Zhou, who is now at Westlake University,

00:19:49.420 --> 00:19:54.700
a brilliant student of mine. So actually we can prove, with the same contractivity,

00:19:54.700 --> 00:20:01.420
that for long times the error is still small, independent of T, okay? As long as it's contractive,

00:20:02.140 --> 00:20:08.380
you have this hypocoercivity, so for long times it's still accurate. Also, I want to talk about the

00:20:08.380 --> 00:20:15.500
error of the random batch method, okay? You still have to discretize the ODE or SDE, with Euler or Euler-Maruyama, whatever;

00:20:15.500 --> 00:20:19.020
you still have to give a discretization. Even for the fully time-discrete random batch method,

00:20:20.220 --> 00:20:25.020
Zhennan Zhou has now also proved that you still get a similar error, okay?

00:20:28.620 --> 00:20:33.020
You can also analyze the mean-field limit: you basically define an empirical measure,

00:20:34.140 --> 00:20:39.180
basically putting a Dirac mass at each particle, all right? And of course, when N goes to infinity,

00:20:39.180 --> 00:20:42.700
you get the mean-field limit, which is actually a Fokker-Planck equation.

00:20:42.700 --> 00:20:48.140
So this drift term comes from the external potential, and the interaction part

00:20:48.780 --> 00:20:53.420
gives this convolution, okay, with the interaction force, all right? Then the noise,

00:20:53.420 --> 00:20:59.580
the Brownian motion, gives you the diffusion term. So actually, this method is not only an

00:20:59.580 --> 00:21:07.100
efficient method to solve the original particle system; you can also view it as a particle

00:21:07.180 --> 00:21:13.420
method to solve the mean-field Fokker-Planck equation, okay? So this is the original particle measure mu_N,

00:21:13.420 --> 00:21:17.820
and this is the random batch particle system; when the time step goes to zero, it converges.

00:21:18.700 --> 00:21:22.620
This is the mean-field Fokker-Planck equation, which is valid when N is very large.

00:21:23.820 --> 00:21:27.420
So this random batch method can be viewed as a numerical particle method

00:21:27.980 --> 00:21:35.100
to solve the Fokker-Planck equation, okay? And you can also use this method,

00:21:35.500 --> 00:21:41.260
and if you do the error analysis, the error is the sum of the random batch error

00:21:41.260 --> 00:21:45.260
plus the mean-field error. The mean-field error is usually one over the square root of N.

00:21:46.140 --> 00:21:50.780
So you see, if N is large, this is not only a good method to solve the

00:21:50.780 --> 00:21:54.780
particle system; it is also a good method to solve the mean-field equation, all right?

00:21:56.540 --> 00:22:00.620
Then of course, when we prove theorems, we have to make regularity assumptions:

00:22:00.700 --> 00:22:04.700
nice functions, smooth functions. But in practice, this method works

00:22:04.700 --> 00:22:10.460
even for singular potentials. So here we actually have a bunch of examples. For example,

00:22:10.460 --> 00:22:15.180
take Dyson Brownian motion: basically you want to find the eigenvalues of a random matrix,

00:22:15.740 --> 00:22:20.700
right? So you can find those eigenvalues by solving

00:22:20.700 --> 00:22:26.620
an interacting particle system. Here lambda_j is the j-th eigenvalue of this random matrix, all right?

00:22:26.700 --> 00:22:31.420
So in this interacting particle system, you have an external force from a linear potential, essentially.

00:22:31.420 --> 00:22:36.700
Now the interaction force actually can be singular, because it's

00:22:36.700 --> 00:22:41.340
one over the difference between eigenvalues. If two eigenvalues are close, or you have

00:22:43.340 --> 00:22:46.460
multiple eigenvalues, I mean, this is a singular potential, right?
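To make the structure concrete, here is a minimal noise-free Python sketch of such a Dyson-type system: a linear restoring force plus the singular pairwise repulsion one over the eigenvalue gap. The step size and initial data are illustrative choices of mine; the repulsion keeps the "eigenvalues" ordered.

```python
def dyson_drift(lam):
    """Drift of a noise-free Dyson-type system: -lam_i (linear restoring force)
    plus (1/N) * sum_{j != i} 1/(lam_i - lam_j) (singular pairwise repulsion)."""
    N = len(lam)
    return [-lam[i] + sum(1.0 / (lam[i] - lam[j]) for j in range(N) if j != i) / N
            for i in range(N)]

def evolve(lam, tau, steps):
    """Plain forward Euler; tau must be small relative to the eigenvalue gaps."""
    for _ in range(steps):
        d = dyson_drift(lam)
        lam = [a + tau * b for a, b in zip(lam, d)]
    return lam
```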

00:22:47.420 --> 00:22:51.020
The second example is about charges. This is a very famous physics problem, called the

00:22:51.660 --> 00:22:57.260
Thomson problem, where you put, like, electrons on the unit sphere,

00:22:58.220 --> 00:23:02.700
and they are subject to Coulomb interaction, all right? You ask, what is the equilibrium

00:23:02.700 --> 00:23:08.140
distribution? So that's called the Thomson problem. Actually, it has a semicircle law.

00:23:08.140 --> 00:23:12.700
So here again, you solve interacting particles with the Coulomb force; here F is the Coulomb force,

00:23:12.700 --> 00:23:15.340
which is actually singular: one over the distance between particles.

00:23:15.340 --> 00:23:21.900
Okay. And this is Brownian motion on the sphere. This P_S is the projection operator,

00:23:21.900 --> 00:23:29.500
the projection onto the unit sphere. And this is some crazy finance model where actually,

00:23:29.500 --> 00:23:35.660
the interaction is very long-range. Here the force increases with distance, which is not physical, okay?

00:23:36.220 --> 00:23:41.020
And you also have multiplicative noise here: the strength of the noise depends on the solution.

00:23:41.020 --> 00:23:45.980
All these examples work numerically. Although proving theorems for them is difficult, the method works.

00:23:48.700 --> 00:23:52.940
Well, after this work, there have been several extensions. For example, in machine learning,

00:23:52.940 --> 00:23:57.660
people are interested in sampling from general distributions. One way to do it is to use

00:23:57.660 --> 00:24:01.980
Stein variational gradient descent. You can write it as an interacting Langevin-type equation.

00:24:02.700 --> 00:24:09.500
So there is this group at Duke, Jianfeng Lu and Jian-Guo Liu, and so on. You can also use this to compute, say,

00:24:10.220 --> 00:24:15.980
Gibbs measures generated by interacting particle systems with singular potentials,

00:24:15.980 --> 00:24:22.860
for example, a Lennard-Jones potential. Okay. Actually, Enrique has done some very interesting

00:24:22.860 --> 00:24:29.020
work on the control of synchronization. You can describe synchronization

00:24:29.020 --> 00:24:35.660
by a particle system, for example the Kuramoto model. Actually, Kuramoto got the

00:24:36.460 --> 00:24:41.740
Boltzmann Medal this year for his work on this synchronization model. So Enrique did

00:24:41.740 --> 00:24:48.300
beautiful work on controlling this using random batch methods. And with Seung-Yeal Ha at Seoul National

00:24:48.300 --> 00:24:53.740
University, we also wrote a few papers on agent-based models for collective behavior, like

00:24:53.740 --> 00:25:01.580
swarming and flocking: Cucker-Smale models, Vicsek models, also Kuramoto-type models, and so

00:25:02.460 --> 00:25:07.500
on. But next, actually, I will mainly focus on an important physical example, which is molecular dynamics.

00:25:10.700 --> 00:25:16.940
So, this is joint work with Zhenli Xu; he's also my colleague in the math department. So here

00:25:16.940 --> 00:25:24.060
we are interested in computing Coulomb interactions, okay, which have many important applications,

00:25:24.060 --> 00:25:31.260
for example, in drug discovery and in the design of new materials. Okay. So here, this is the Coulomb force.

00:25:31.500 --> 00:25:36.060
Actually, in molecular dynamics, it's very complicated. You have terms corresponding to

00:25:36.060 --> 00:25:41.580
bonds, angles, torsions; you have a Lennard-Jones potential, which is actually very singular.

00:25:41.580 --> 00:25:46.540
All right. And then you have the Coulomb interaction. That's the most challenging part. In the

00:25:46.540 --> 00:25:53.100
Coulomb interaction here, the q_i, q_j in the numerator are the charges. All right. Then

00:25:53.100 --> 00:25:57.500
the denominator is the distance between particles i and j, the distance between two

00:25:57.500 --> 00:26:01.660
electrons. All right. So clearly, this is an O(N²) summation.

00:26:05.660 --> 00:26:10.300
So the idea is based on the so-called Ewald splitting. About 100 years ago,

00:26:10.300 --> 00:26:16.220
Ewald had this simple but beautiful idea. Here we have a long-range interaction, because

00:26:17.180 --> 00:26:25.580
the interaction decays like one over r. So it decays very slowly. All right. So if you want to

00:26:25.580 --> 00:26:34.060
truncate at a large r, the cutoff has to be very large to keep the error small. But Ewald just writes one over

00:26:34.060 --> 00:26:40.540
r as a sum of two terms using the error function. The error function is the integral from zero to x of e to the minus t squared, up to a constant.

00:26:41.340 --> 00:26:46.620
The error function divided by r is one piece, and one minus the error function divided by r is the other. So that's the identity.

00:26:46.620 --> 00:26:59.180
Right. And actually, the error function divided by r is smooth near r equals

00:26:59.180 --> 00:27:06.540
zero. Okay. The singularity at r equals zero is removable, so it's actually a smooth function.

00:27:07.180 --> 00:27:12.620
For a smooth function, you can take the Fourier transform. Meanwhile, one minus the error function, the

00:27:12.620 --> 00:27:17.500
complementary error function, divided by r, actually decays exponentially in r.

00:27:19.260 --> 00:27:23.500
So in fact, because it decays exponentially, it's a short-range interaction.
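The split and its two key properties (smoothness at zero and fast decay) can be checked directly with the standard library's erf and erfc; alpha is an illustrative splitting parameter of my choosing:

```python
import math

ALPHA = 1.0  # splitting parameter (illustrative value)

def smooth_part(r):
    """erf(alpha*r)/r: the long-range piece, smooth at r = 0, handled in Fourier space."""
    return math.erf(ALPHA * r) / r

def short_part(r):
    """erfc(alpha*r)/r: the short-range piece, decaying essentially like a Gaussian."""
    return math.erfc(ALPHA * r) / r
```

Since erf + erfc = 1, the two pieces sum exactly to 1/r for every r > 0, and near zero the smooth part tends to the finite value 2 alpha over the square root of pi.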

00:27:23.500 --> 00:27:28.060
I can throw it away when r is relatively large, because it's decaying exponentially.

00:27:29.020 --> 00:27:36.620
Therefore, for this exponentially decaying term, I just sum it as it is, truncated at a relatively

00:27:36.620 --> 00:27:40.940
large r. It's like a short-range interaction: the summation has only a few terms.

00:27:41.580 --> 00:27:49.740
The smooth part I put in Fourier space, and there you have this e to the minus k squared term.

00:27:50.300 --> 00:27:53.900
All right. And that's like a Gaussian, a Gaussian distribution.

00:27:55.900 --> 00:27:59.820
In principle, here you have to add all the k's, an infinite number of k's.

00:28:01.340 --> 00:28:06.060
Because it's a Gaussian, instead of adding all the k's, I'm going to do a

00:28:06.060 --> 00:28:13.020
random batch. I just select randomly a few k's. I'm not going to add all the k's.

00:28:13.900 --> 00:28:16.540
So the cost will be reduced from O(N squared) to O(N).

00:28:18.300 --> 00:28:22.380
Another important thing here: because this e to the minus k squared is a Gaussian,

00:28:22.380 --> 00:28:27.660
a Gaussian distribution, and you know how to sample Gaussians, right? So you just sample the Gaussian

00:28:28.780 --> 00:28:35.980
offline. Beforehand, you take a lot of samples from the Gaussian and save them in a library.

00:28:36.780 --> 00:28:42.700
At each random batch step, you just pick up p terms from those samples.

00:28:43.580 --> 00:28:49.580
This corresponds to importance sampling. You know, in Monte Carlo methods, you not only have to get the average

00:28:49.580 --> 00:28:55.340
right, you have to reduce the variance. Because, according to the central limit theorem, the error

00:28:55.340 --> 00:29:00.140
comes from the variance. You have to find a way to reduce the variance. One way to effectively

00:29:00.140 --> 00:29:05.260
reduce the variance is importance sampling. So by pre-sampling from

00:29:05.260 --> 00:29:11.340
the Gaussian, we actually implement a Monte Carlo method with importance sampling,

00:29:11.900 --> 00:29:19.500
which allows us to compute for a long time with very good accuracy. So it's a random batch plus

00:29:19.500 --> 00:29:28.860
importance sampling. So here, we just pick up p terms of the summation. We sample them a priori.

00:29:29.420 --> 00:29:35.500
You put them in a library; then each time you just take a few terms, a few samples.
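
A minimal sketch of this offline pre-sampling (illustrative: the Gaussian mode weight e^(-|k|^2/(4*alpha^2)), the library size, and `ALPHA` are assumptions standing in for the actual random batch Ewald parameters):

```python
import math
import random

random.seed(0)
ALPHA = 1.0  # assumed Ewald splitting parameter

# Offline: pre-sample a large library of Fourier modes k from the Gaussian
# weight e^(-|k|^2 / (4*ALPHA^2)); this is the importance sampling step,
# since modes are drawn in proportion to their weight in the Fourier-space sum.
sigma = math.sqrt(2.0) * ALPHA
library = [tuple(random.gauss(0.0, sigma) for _ in range(3)) for _ in range(100_000)]

def draw_batch(p):
    # Online: each random batch step just picks p pre-sampled modes.
    return random.sample(library, p)
```

Drawing from the pre-built library at each step avoids re-sampling the Gaussian online and keeps the per-step cost proportional to p.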

00:29:37.820 --> 00:29:43.500
So for example, we did a numerical simulation of charge inversion in a salty environment where

00:29:44.540 --> 00:29:50.700
we have an all-atom simulation of about 17,000 water molecules. So if you compare the CPU time,

00:29:50.700 --> 00:29:56.220
actually, if you do the original Ewald summation, it's like 15,000 or 16,000 units. Random batch Ewald, that's what we

00:29:56.220 --> 00:30:04.140
call it, with p equals 100. So here we have 17,000 particles in total, but each random batch will

00:30:04.140 --> 00:30:12.700
only pick up 100 particles. So the time is one order of magnitude smaller. And if you compare

00:30:12.700 --> 00:30:18.780
the cost, the original Ewald is this line here, while in our case the cost is actually

00:30:18.780 --> 00:30:26.140
linear. So we compared p equals 10 to 100; the cost is linear in N. And later, another

00:30:26.140 --> 00:30:33.100
colleague of mine, Hong Liang, who is a biophysicist, learned our method. So he

00:30:33.100 --> 00:30:43.180
ran big simulations. He basically exhausted SJTU's supercomputer, with like 10,000 CPUs. So he did

00:30:43.820 --> 00:30:55.180
simulations of 10,000 molecules. Of course, the computational efficiency is one order of magnitude faster.

00:30:56.780 --> 00:31:02.700
What's more amazing is the scalability. It turns out, if you have the state-of-the-art

00:31:02.700 --> 00:31:07.260
molecular dynamics simulation — for example, the particle-particle-particle mesh method,

00:31:07.260 --> 00:31:12.060
called PPPM, or particle mesh Ewald, PME — those are the state-of-the-art

00:31:12.060 --> 00:31:16.620
molecular dynamics simulators being used for drug discovery and new materials.

00:31:17.740 --> 00:31:22.460
Those methods are all based on FFT. And we know FFT is not scalable.

00:31:23.420 --> 00:31:30.300
Okay, it's non-local. But this random batch method is scalable. You look at the parallel efficiency

00:31:30.860 --> 00:31:39.500
with 10,000 CPUs, PPPM has like 20% parallel efficiency. While the random batch here,

00:31:39.500 --> 00:31:46.780
where each batch is 50 or 100 particles, the parallel efficiency is almost 100%. It's almost 100%

00:31:46.780 --> 00:31:53.420
parallel. The reason is, when you do parallel computing, there is message passing:

00:31:53.420 --> 00:31:59.020
you have to pass messages from one CPU to the others; that's called message passing. That will

00:31:59.020 --> 00:32:05.500
eat up most of the cost. So the saving you have on each single CPU is completely

00:32:07.580 --> 00:32:13.260
wiped out by the cost of message passing. But with the random batch method,

00:32:14.220 --> 00:32:18.700
each time you only involve a few particle interactions, so you only involve message

00:32:18.700 --> 00:32:26.140
passing among very few CPUs. So that significantly reduces the message passing cost,

00:32:26.140 --> 00:32:32.540
which allows us to maintain a parallel efficiency of almost 100%. So it's completely scalable,

00:32:32.540 --> 00:32:38.380
highly scalable. Of course, we did all the tests, all right? I mean, we did one million time

00:32:38.380 --> 00:32:46.060
steps, from femtoseconds to nanoseconds, and maintained very good accuracy compared with PPPM.

00:32:46.940 --> 00:32:53.020
With these simulations, each random batch is like a few hundred particles. I'm talking about 10 to the seven

00:32:53.020 --> 00:32:59.980
particle molecular dynamics simulations. Each random batch is 10 to the two. You see, they did all the

00:33:01.740 --> 00:33:07.980
benchmark comparisons for accuracy. And these are actually hydrodynamic variables,

00:33:07.980 --> 00:33:13.740
like kinetic energy. Also you get nice agreement. And later, actually,

00:33:13.740 --> 00:33:19.420
molecular dynamics is very sophisticated. There are all these different ensembles — NVE, NPT, all these

00:33:19.420 --> 00:33:24.220
different ensembles. So this method has been extended to all these different ensembles.

00:33:27.660 --> 00:33:33.180
More recently, actually, we also use random batch not only for the long-range interaction,

00:33:33.260 --> 00:33:43.340
we also use it for short-range interaction. Now, actually, with one GPU, we can do 10 million

00:33:44.300 --> 00:33:51.020
atom, all-atom molecular dynamics simulations, on just one GPU. So that's our recent result.

00:33:54.780 --> 00:33:59.900
So with that, actually, well, I have an AI institute in Chongqing. It's beautiful.

00:34:00.060 --> 00:34:05.180
Chongqing is a very fascinating city you should visit. It's as fascinating as Shanghai, by the way.

00:34:06.940 --> 00:34:12.620
So we have, actually, an open-source random batch simulator. It's on GitHub, if you are interested.

00:34:13.340 --> 00:34:18.060
Here is the address. And if you want to do a test, you just email this address.

00:34:19.020 --> 00:34:22.540
And, actually, we collaborated with a high-performance computer company in China,

00:34:23.180 --> 00:34:28.940
Sugon. And now we have built this random batch software, which does 10 to 7

00:34:29.420 --> 00:34:36.540
atom molecular dynamics on one GPU. Here it is. So all you need is this equipment on your

00:34:36.540 --> 00:34:43.980
desktop. You can do molecular dynamics with 10 to the 7 particles. Before, people could only do 10 to 6

00:34:43.980 --> 00:34:51.020
million scale. Now, 10 million in your office. You don't need a massively parallel computer.

00:34:51.020 --> 00:34:52.940
One GPU with this machine.

00:34:57.100 --> 00:35:01.100
Why is this very exciting all of a sudden? I'm very excited. I mean, now, you know, there is a lot of

00:35:02.300 --> 00:35:06.300
big news about AI for science. The biggest, of course, is the

00:35:08.060 --> 00:35:14.540
AlphaFold, which got a Nobel Prize last year, right? With AlphaFold, actually, you can

00:35:15.500 --> 00:35:20.700
now predict very accurately the structures of proteins, maybe all proteins on Earth.

00:35:21.500 --> 00:35:22.300
From the protein sequence.

00:35:25.180 --> 00:35:30.380
There was a joke. There are some very famous Chinese scientists — biologists — who actually,

00:35:30.380 --> 00:35:34.780
before, used this — what is it called? I forgot the English —

00:35:37.420 --> 00:35:41.420
anyway, I mean, they used this very expensive equipment. You know the English?

00:35:42.460 --> 00:35:50.220
What's the English for this? I forgot. Anyway, they used to do experiments, all right? Before,

00:35:50.220 --> 00:35:55.580
in the old days, you needed one PhD dissertation to find one structure — one protein.

00:35:55.580 --> 00:36:00.540
Now, of course, AlphaFold did them all. So people are joking that the experimentalists are going

00:36:00.540 --> 00:36:07.580
to be out of a job, right? So Yan Ning actually is a household name in China in this business.

00:36:07.580 --> 00:36:13.420
So people are joking, okay, you have no job to do. So this is what she said on her blog. She likes

00:36:13.420 --> 00:36:18.940
writing blogs. She's an internet sensation. She was at Princeton. Now she has come back to China.

00:36:20.220 --> 00:36:25.180
She has a big research institute. This is what she wrote. Of course, she said, I was praising

00:36:25.180 --> 00:36:31.580
AlphaFold and protein structure prediction. But she said, well, compared with the contribution to

00:36:31.580 --> 00:36:36.700
protein folding, I hope that AI can help more with molecular dynamics simulation for structural

00:36:36.700 --> 00:36:43.340
biology. This field is in urgent need of progress. Okay. But that's exactly what we did, right? You

00:36:43.340 --> 00:36:49.420
have a very fast molecular dynamics simulation technology, which is a very important part:

00:36:49.420 --> 00:36:55.820
if you want to do drug design, you have to do this free energy perturbation, where you need to do a

00:36:55.820 --> 00:36:59.980
lot of molecular dynamics simulation — and also for new material design. So I think it's a very

00:37:00.620 --> 00:37:06.140
important part in AI for science where you have to do a lot of simulation for F equals MA.

00:37:08.140 --> 00:37:09.420
All right. Any questions so far?

00:37:09.420 --> 00:37:17.820
Can I ask a question? So the number of particles is capital N, right?

00:37:17.820 --> 00:37:18.620
Yes. Yeah.

00:37:18.620 --> 00:37:21.660
And you take batches of size P.

00:37:21.660 --> 00:37:22.460
P, yeah.

00:37:22.460 --> 00:37:25.260
So what is the — is there an optimal size?

00:37:25.260 --> 00:37:30.620
Yeah, that's a good question. Well, smaller P, of course, is more efficient,

00:37:30.620 --> 00:37:35.580
but it increases the variance. Actually, you can compute the variance. The variance is

00:37:35.580 --> 00:37:41.900
proportional to one over the square root of P minus one. So increasing P will reduce the variance.

00:37:42.460 --> 00:37:49.420
So it's a balance between efficiency and error, of course. But for these molecular dynamics simulations,

00:37:49.420 --> 00:37:54.140
usually we pick P to be like 10 to the two, while capital N can be 10 to the seven.

00:37:54.940 --> 00:38:02.060
So that usually is good enough. But we don't have a good accuracy analysis for the Coulomb

00:38:02.060 --> 00:38:05.420
interaction, because it's very singular. It's difficult. But for smooth interactions,

00:38:05.420 --> 00:38:06.860
we have this error estimate.
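
The efficiency-versus-variance tradeoff just described can be seen in a toy estimator (the sizes and the linear kernel here are illustrative assumptions, not from the talk):

```python
import random
import statistics

random.seed(1)
N = 200
x = [random.gauss(0.0, 1.0) for _ in range(N)]

def rb_estimate(p):
    # Random batch estimate of the full sum over j != 0 of (x_0 - x_j):
    # draw p - 1 random partners and rescale by (N-1)/(p-1), which keeps
    # the estimator unbiased.
    partners = random.sample(range(1, N), p - 1)
    return sum(x[0] - x[j] for j in partners) * (N - 1) / (p - 1)

# Larger batches cost more per step but give a smaller-variance estimate.
var_p2 = statistics.pvariance([rb_estimate(2) for _ in range(5000)])
var_p50 = statistics.pvariance([rb_estimate(50) for _ in range(5000)])
```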

00:38:14.860 --> 00:38:18.860
Yeah, yeah, sure. Sure. We usually take P to be 10 to the two.

00:38:18.860 --> 00:38:24.940
So it's like 10 times faster. Yeah. This question back there.

00:38:32.060 --> 00:38:44.940
Yeah, yeah, yeah.

00:38:49.740 --> 00:38:53.660
Well, okay, if you want to do drug design, the candidate

00:38:54.300 --> 00:39:05.180
molecules number like 10 to the 60. Okay. But with AI, you can maybe very quickly identify,

00:39:05.180 --> 00:39:11.500
instead of 10 to the 60, maybe a few hundred possible molecules with the potential to make the drugs

00:39:11.500 --> 00:39:19.260
you want. So this AI-aided drug design can accelerate the pre-clinical trial stage:

00:39:19.260 --> 00:39:22.060
before, it took years. Now it just takes a few months.

00:39:37.180 --> 00:39:38.140
Yeah.

00:39:38.140 --> 00:39:42.940
Is there not a way to fix the batches? I mean, I can imagine that if the radius of the

00:39:42.940 --> 00:39:48.380
interaction is small, or if it's local, then it doesn't make too much sense to pick them at random.

00:39:48.380 --> 00:39:53.660
Sure, sure. This works for long range. Yeah, yeah. Short range, of course, is easy, right?

00:39:53.660 --> 00:39:59.020
Yeah, yeah, yeah. And therefore, does it make sense to fix the size of the batches?

00:39:59.820 --> 00:40:05.580
Yeah, that's a good question. I mean, we don't have an exact answer; it's kind of empirical, right? We

00:40:05.580 --> 00:40:13.420
just pick a hundred, five hundred. They're very similar. Of course, how do you sample?

00:40:13.420 --> 00:40:17.900
That's tricky business. Here in molecular dynamics, we have importance sampling according to

00:40:17.900 --> 00:40:24.380
the Gaussian. But for other force kernels, you might have a specific way to sample instead of

00:40:24.380 --> 00:40:30.460
a uniform distribution. For example, if you look at the Lennard-Jones potential, which

00:40:30.460 --> 00:40:36.940
is very singular near the origin, we actually split this kernel into a smooth part, where

00:40:36.940 --> 00:40:43.340
we use a random batch, and a singular part, which we just treat like a short-range

00:40:43.340 --> 00:40:49.020
interaction. Yeah. So you play some tricks for specific potentials. Yeah.
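
One illustrative way to split such a singular potential (the cutoff value and the shift are assumptions, not the scheme actually used in the talk):

```python
def lennard_jones(r):
    return 4.0 * (r ** -12 - r ** -6)

def split_lj(r, r_cut=2.0):
    # Singular short-range part: the full potential inside the cutoff, shifted
    # so it vanishes at r_cut; handled by direct near-neighbor summation.
    # Smooth remainder: bounded and slowly varying, suitable for random batch.
    v, v_cut = lennard_jones(r), lennard_jones(r_cut)
    if r < r_cut:
        return v - v_cut, v_cut
    return 0.0, v
```

The two pieces always sum back to the full potential, and the smooth piece stays bounded even as r approaches zero is handled by the short-range part alone.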

00:40:49.020 --> 00:41:07.660
Yeah, yeah. High-dimensional PDEs. We have to use particle methods, yes. Yeah, one example:

00:41:07.660 --> 00:41:13.900
we used this for the Landau equation. I mean, that's kind of Boltzmann, but for plasmas, in the grazing

00:41:13.900 --> 00:41:21.420
collisions limit, with Jose Carrillo at Oxford. Actually, this collision term — the collision is like a summation,

00:41:21.420 --> 00:41:27.820
right? When you discretize it, it becomes a summation. So we actually used it there, which can reduce the

00:41:27.820 --> 00:41:30.780
cost significantly, because of the nonlocal collision integral. Yes.

00:41:33.500 --> 00:41:39.180
Yeah, every time you have a summation, remember from now on, okay, I tell you how to cheat,

00:41:39.900 --> 00:41:45.500
successful cheating. Every time you see a big summation or average, you just do a partial

00:41:45.500 --> 00:42:00.540
summation. It always works. Yeah — if you have an integral representation of it, I never

00:42:00.620 --> 00:42:08.220
tried it. I mean, I think most of the time — okay — it usually works for

00:42:08.220 --> 00:42:14.540
dissipative systems. It does not work for oscillatory systems. Okay, if you have oscillations,

00:42:14.540 --> 00:42:18.620
you usually need very high resolution, and this dissipation kills the

00:42:18.620 --> 00:42:23.100
oscillation. That's not good. But as long as the system is dissipative, it usually works.
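
The "replace every big summation by a partial summation" recipe from a moment ago, in code form (toy linear kernel; the sizes are assumptions, and the sketch assumes the batch size p divides N):

```python
import random

random.seed(2)

def full_sum(x, i):
    # O(N) per particle, O(N^2) overall: sum over j != i of (x_i - x_j).
    return sum(x[i] - x[j] for j in range(len(x)) if j != i)

def random_batch_sum(x, p):
    # O(N) overall: shuffle the indices into batches of size p; each particle
    # interacts only within its own batch, rescaled by (N-1)/(p-1) so the
    # estimator is unbiased.
    n = len(x)
    idx = list(range(n))
    random.shuffle(idx)
    out = [0.0] * n
    for b in range(0, n, p):
        batch = idx[b:b + p]
        for i in batch:
            out[i] = sum(x[i] - x[j] for j in batch if j != i) * (n - 1) / (len(batch) - 1)
    return out
```

With p = N there is a single batch and the rescaling factor is one, so the random batch sum reduces to the full sum.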

00:42:23.100 --> 00:42:29.660
So that's my experience. Yeah. Even for fractional diffusion, I expect it to work,

00:42:29.660 --> 00:42:33.660
although I never worked on it. The reason I didn't work on it is because there are too many

00:42:33.660 --> 00:42:37.180
works on it already. I usually don't want to jump into a busy field.

00:42:39.100 --> 00:42:44.860
Yeah, I think it would probably work, but we should try it. Yeah, thanks. Okay,

00:42:44.860 --> 00:42:49.020
now let me get to the quantum part. Okay, so I was talking about a classical particle system,

00:42:49.020 --> 00:42:54.620
but now the quantum system. So this is joint work with two distinguished

00:42:54.620 --> 00:43:00.460
French mathematicians, François Golse at École Polytechnique and Thierry Paul at Sorbonne University.

00:43:00.460 --> 00:43:04.140
So here we have the N-body Schrödinger equation, which is the ultimate physical equation we want

00:43:04.140 --> 00:43:08.940
to solve. Maybe — nowadays people talk about quantum computing; that's designed for this equation.

00:43:08.940 --> 00:43:13.900
But anyway, for the N-body Schrödinger equation, the difficulty here is of course the

00:43:13.900 --> 00:43:18.060
curse of dimensionality, because here you have N particles, and each particle is

00:43:18.060 --> 00:43:23.740
three-dimensional. So it's a PDE of dimension 3N, and usually N is a few hundred.

00:43:25.100 --> 00:43:28.220
Of course, here the Hamiltonian has two parts. The kinetic part is basically

00:43:28.220 --> 00:43:33.180
a Laplacian for each particle. All right, then I have this Coulomb interaction between particles.

00:43:34.460 --> 00:43:38.620
This of course is so-called first-principles simulation, because there are no,

00:43:39.420 --> 00:43:43.340
no free parameters. Okay, if you can solve it, you understand everything about the physical world.

00:43:43.340 --> 00:43:48.540
You can design all kinds of drugs you need for example. So here of course,

00:43:48.540 --> 00:43:53.580
for this interaction part, we can easily use random batch. So we have the so-called random batch Schrödinger equation,

00:43:54.220 --> 00:44:00.140
where instead of adding all the terms, basically we just add one term: for each particle i,

00:44:00.140 --> 00:44:05.500
we randomly pick another particle j. Okay, so we get rid of this summation;

00:44:05.500 --> 00:44:09.900
that's called a random batch. Of course, this is a theoretical result — we cannot,

00:44:09.900 --> 00:44:13.660
we don't have a computer to solve this equation. Okay, we cannot use this for practical purposes,

00:44:13.660 --> 00:44:18.860
but maybe in the future. But for now we have theory. So the theory says the following.

00:44:20.540 --> 00:44:24.940
And of course, for the theory, you have to use the von Neumann equation, which is basically an

00:44:24.940 --> 00:44:31.100
equation for the density matrix. So here is the N-body density matrix, where this bracket is just the

00:44:31.100 --> 00:44:37.500
commutator of the operators. So there we have a random batch von Neumann equation.

00:44:38.460 --> 00:44:42.220
So this is what we prove. We have to make some assumptions on the potential:

00:44:42.860 --> 00:44:49.500
symmetry and decay. Essentially, what we find is the following. This W h-bar is the Wigner transform.

00:44:51.740 --> 00:44:57.100
And here we take the first marginal. So of course, we have to take the first marginal. It does not work for the

00:44:57.100 --> 00:45:03.500
wave function. It works for the first marginal. The tilde is the random batch first marginal;

00:45:03.500 --> 00:45:08.300
you have to take the expectation. So the difference between the random batch first marginal —

00:45:08.300 --> 00:45:14.220
its Wigner transform — and the original analytic solution, well, is measured in a dual norm.

00:45:17.260 --> 00:45:25.900
It's bounded by a constant independent of N, and also independent of h-bar, because h-bar can be

00:45:25.900 --> 00:45:32.140
very small; it's a very stiff, highly oscillatory problem. So the error is linear,

00:45:32.140 --> 00:45:36.300
in a weak norm — that is, first order in the time step — and all the constants are independent of N,

00:45:36.300 --> 00:45:41.340
independent of h-bar. It's a uniform error. So here, this is the definition of the dual norm. Basically,

00:45:41.340 --> 00:45:48.140
you have an H-minus-m norm: essentially, you test against C-m functions. That's

00:45:48.140 --> 00:45:53.660
the dual norm. So it shows that it also works for the quantum N-body problem.

00:45:57.020 --> 00:46:00.220
Except that, of course, we don't have a numerical simulation to show it works, because

00:46:01.340 --> 00:46:06.700
it's something for the future — a theory for the future. Then came work with Xiantao Li at Penn State.

00:46:07.340 --> 00:46:12.700
Actually, we also extended this idea to so-called quantum Monte Carlo. As I said, you cannot

00:46:12.700 --> 00:46:18.060
afford to solve the N-body Schrödinger equation; what people usually do in quantum

00:46:18.060 --> 00:46:22.540
chemistry is to find the ground state, which is the minimum eigenvalue of the Hamiltonian,

00:46:22.540 --> 00:46:28.220
quantum Hamiltonian. So there are different ways to do it. One way is to use Markov chain

00:46:31.100 --> 00:46:37.580
Monte Carlo — the Metropolis-Hastings algorithm, for example. You can also solve this by solving an

00:46:37.580 --> 00:46:44.300
interacting particle Langevin equation. So here, for the i-th particle, you have an external

00:46:44.300 --> 00:46:48.540
potential, which is the log of phi. Then you have this Coulomb interaction. Then you have a

00:46:48.540 --> 00:46:55.740
white noise term. So the equilibrium solution of this dynamical system

00:46:55.740 --> 00:47:01.340
gives the distribution of the ground state. It's called the variational Monte Carlo method.

00:47:02.060 --> 00:47:07.900
So here, of course, whenever you see a summation, I just pick up one random term.

00:47:07.900 --> 00:47:14.700
So for particle i, I just pick a random j, with j different from i. So you see, now I have

00:47:14.700 --> 00:47:21.820
this algorithm without the summation term. Basically, I pick p equals two. The example

00:47:21.820 --> 00:47:30.620
system has two particles. So you move like 400 Markov chains for 1,000 steps. If you just do

00:47:31.180 --> 00:47:37.980
Metropolis, the CPU time is like 1,500 units. If you solve this Langevin equation directly using

00:47:37.980 --> 00:47:44.540
Euler-Maruyama, it's about 460, while with the random batch, it's only 50 units.
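
A minimal Euler-Maruyama sketch of such a random batch Langevin step (a 1-D toy: the confining drift -x and the weak pairwise repulsion are assumptions standing in for the actual variational Monte Carlo drift, and the sketch assumes p divides N):

```python
import math
import random

random.seed(3)

def rb_langevin(x, dt=0.01, steps=200, p=2):
    # Overdamped Langevin dynamics: drift -x from the confining potential
    # V(x) = x^2/2, a pairwise interaction evaluated only within random
    # batches of size p, and white noise sqrt(2*dt)*N(0,1).
    n = len(x)
    for _ in range(steps):
        idx = list(range(n))
        random.shuffle(idx)
        drift = [-xi for xi in x]
        for b in range(0, n, p):
            batch = idx[b:b + p]
            for i in batch:
                inter = sum(x[i] - x[j] for j in batch if j != i) / (len(batch) - 1)
                drift[i] += 0.1 * inter  # weak repulsion, random batch estimate
        x = [x[i] + dt * drift[i] + math.sqrt(2 * dt) * random.gauss(0.0, 1.0)
             for i in range(n)]
    return x

samples = rb_langevin([0.0] * 100)
```

Each step costs O(N) instead of O(N squared), since every particle only sees its batchmates; the long-time samples approximate the equilibrium distribution.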

00:47:46.780 --> 00:47:52.540
A more general version of quantum Monte Carlo is called diffusion Monte Carlo. So here, instead

00:47:52.540 --> 00:47:59.500
of solving the Schrödinger equation, which has an i in front of d/dt, you just delete the i. It becomes the

00:47:59.500 --> 00:48:04.220
heat equation. The reason is the heat equation, because it is dissipative, converges to the

00:48:04.220 --> 00:48:08.060
equilibrium solution much faster. The Schrödinger equation is conservative.

00:48:09.100 --> 00:48:14.700
So you can also solve this by some Langevin equation. So this is a force term: you have an

00:48:14.700 --> 00:48:20.460
external force, you have the Coulomb interaction. And actually, in the implementation, there are extra terms:

00:48:21.180 --> 00:48:25.980
You not only have binary interactions, you have triple interactions. You have a summation with

00:48:25.980 --> 00:48:31.260
three indices. But it doesn't matter. This not only works for two-body interactions; it works for

00:48:31.260 --> 00:48:39.180
any-body interactions. Here, with a summation with three indices, you just pick a random triplet,

00:48:39.180 --> 00:48:47.420
ijk. Then you delete the summation. So you can carry this through. If you do the original,

00:48:47.500 --> 00:48:55.980
direct diffusion Monte Carlo, it's quadratic in N, but now, in our case, it's linear in N.

00:48:59.260 --> 00:49:04.380
Actually, Eitan Tadmor and Jose Carrillo, together with Nicola

00:49:08.860 --> 00:49:16.220
Bellomo — who is editor-in-chief of M3AS — edited this book series on active particles.

00:49:16.780 --> 00:49:20.860
So they invited me and Lei Li to write a review article on random batch. So, I'm sorry,

00:49:20.860 --> 00:49:25.500
this slide is in Chinese. So there are extensions to molecular dynamics, quantum Monte Carlo,

00:49:26.540 --> 00:49:32.300
many-body Schrödinger equations, social and biological collective behavior, sampling problems in

00:49:32.300 --> 00:49:39.500
statistics and machine learning. So these are the applications. This work was also selected —

00:49:39.980 --> 00:49:45.740
every year SJTU selects the top 10 advances in science and technology, and this work was

00:49:46.540 --> 00:49:52.620
selected in one of those years. And also, last year we got the first-class Shanghai Natural Science Prize

00:49:52.620 --> 00:49:58.700
for this work. Okay, here's the conclusion. So basically — you'll probably agree with me that

00:49:59.500 --> 00:50:03.260
Newton's equation and the Schrödinger equation are the two most fundamental equations in physics.

00:50:03.980 --> 00:50:10.060
Okay. So when you have N particles, the computational cost is very expensive. So for both,

00:50:10.940 --> 00:50:15.980
we introduced a random batch method to reduce the cost from O(N squared) to O(N).

00:50:16.940 --> 00:50:20.620
Very simple. The method is very simple. Compared to the fast multipole method, this is extremely simple.

00:50:21.580 --> 00:50:28.300
I mean, theoretically, for classical particles, we can prove an error estimate, okay,

00:50:28.380 --> 00:50:33.260
independent of N. For quantum dynamics — the quantum model — we can actually prove an

00:50:34.140 --> 00:50:40.300
error estimate. And that's for the first marginal, with error independent of N and also of h-bar. So it works

00:50:41.260 --> 00:50:47.580
all the way into the semiclassical regime. And then we extended it — we used it for classical

00:50:47.580 --> 00:50:52.780
molecular dynamics, based on random batch Ewald, and also quantum Monte Carlo.

00:50:53.420 --> 00:50:59.180
I mean, the cost is O(N), and more importantly, it's highly scalable. Okay, these are highly parallel

00:50:59.180 --> 00:51:05.820
algorithms. Of course, there is a lot of theory and practice to develop along the way. For example,

00:51:06.460 --> 00:51:12.780
can you prove the estimate with less regularity? Here we assume nice regularity to get the theorem,

00:51:12.780 --> 00:51:19.420
but you have Coulomb interactions, for example, and so on. Also, for applications: for quantum Monte Carlo,

00:51:19.420 --> 00:51:25.500
we can only do bosonic systems. We don't know how to do fermionic systems. For a fermionic system,

00:51:25.500 --> 00:51:30.540
I mean, the wave function is antisymmetric: if you switch two particles, you get a minus sign.

00:51:31.180 --> 00:51:35.980
So you have to use a Slater determinant. And we don't know how to do random batch for that

00:51:35.980 --> 00:51:42.620
kind of setting. But the most exciting application we are currently doing — you know, China is the leader

00:51:42.620 --> 00:51:47.740
in new energy vehicles and new energy batteries. All right. So this company,

00:51:47.740 --> 00:51:53.820
CATL, is the leading company for batteries — new energy batteries. So currently we're working with

00:51:53.820 --> 00:52:00.780
them to do molecular dynamics for lithium battery design. So last week, actually, some colleagues

00:52:00.780 --> 00:52:05.580
were visiting them. So they're very interested in using this for the future new energy battery

00:52:05.580 --> 00:52:07.740
design. So with that, I stop. Thank you.

