Super-Human Operator: Controlling Accelerators with Machine Learning

Public lecture presented by Auralee Edelen on October 1, 2019

I'm introducing Auralee Edelen she will be giving a talk on controlling

accelerators with machine learning and I'm gonna tell you about her background and when you hear about it you would

think that right when she was applying for her undergraduate she was planning this talk because she charted a course

that so she did her undergraduate degree in physics from Rensselaer Polytechnic

Institute with minors in several disciplines philosophy science mathematics psychology and science

technology and society after graduating she became a research engineer in the

theory modeling and analysis branch of the Naval Surface Warfare Center in Bethesda Maryland where she worked on

the complex problem of reducing the electromagnetic signatures of submarines how cool is that

in 2012 she brought her interest in complex systems optimization and machine

learning to Colorado State University where she became interested in applying her skills to particle accelerators for

her PhD she used neural networks to model and control the accelerator system

at the Fermi National Accelerator Laboratory after completing her thesis Auralee

joined SLAC as a research associate in the accelerator research division where she continues to work with a growing

group of co-workers and collaborators on applying machine learning to particle accelerators so without further ado

thanks everyone for being here tonight this is a really fun opportunity to kind of share some of the joy that I find in

working in this area with all of you guys so I am gonna start okay

I'm gonna start by telling you a little bit about my background in the hopes

that as I'm doing this I can also talk about some of the concepts that initially got me interested in this area

to sort of spark some of the sort of frame of mind and imagination that comes

around with a lot of this so I grew up in the middle of nowhere in upstate New York which means that I spent a lot of

time running around outside wondering about the world around me and I think this is a big reason why I got

interested in science you know it was just fun to wonder about how all these different things worked um when I was in

high school I was already sort of interested in astronomy and physics and there was a science essay contest that

came up and as part of this there was a cash prize and I wanted to write the

essay but I also wanted that cash prize so that I could go out and buy a telescope so that's what I did I won the

contest I got this telescope and then I used it in my family's field often times

in the cold winter months when the viewing is really clear and I remember

distinctly this first time that I was wandering out into the field the snow is glistening all around me I had this

telescope with me and I could look up at all these stars and globular clusters

and nebulae and I remember thinking to myself how cool that was that I'm here

on this tiny planet that has a socio-economic system where I could write some words some well-thought-out

words win a contest use the reward from that to buy a telescope that now

enables me to look at things that are thousands of light-years away and on top

of that using a tool that you know took hundreds of years of collective engineering to build so that kind

of way of thinking I guess is sort of the common thread in all this and when I

went to college I started getting really interested not just in physics but

specifically in complex systems so these are things like the super structure of

the universe where you have all these interactions between different centers

of mass and evolves over time in these sorts of really interesting ways I was

interested in the brain how is it that we have all of these neurons that communicate and somehow we you know take

all these collective effects from these small interactions and wind up interacting in the world in a reasonable

way things like slime molds which you know can actually solve complex optimization

problems pretty bizarre and even just things like social networks what are

these different clusters of connections that this person has and the common

theme in a lot of these systems is that they're really difficult to predict what's gonna happen right so when we're

trying to model the earth and how things evolve on the earth in terms of weather

or like long-term trends like co2 content this gets really hard to predict

I mean to the point where you need to run thousands of hours on a supercomputer to be able to do so so

this was all really cool and I think it at some point when I was working in DC

sort of evolved from this idea of just trying to understand and predict

these sorts of things to what happens when you actually need to take a really complicated system and control it so

something like a tokamak where you're trying to combine or confine a really hot plasma and get some energy out of it

it's like a really hard really interesting problem and to kind of sort

of highlight how complicated this can become let's take a step back to the

more mundane things you all have the experience probably of stepping

into a hotel bathroom and you want to take a shower and you have no

idea really how to adjust the knobs on the shower so that you're not scalding yourself or freezing yourself and at

first you're just kind of making small changes seeing how things improve but

you're actually doing something more interesting than that - as you're doing this you're also learning some model of

how this particular shower system works and in your own home you know you

probably know exactly where you need to set the knobs to get just the right temperature for you so that's a really

simple control example and I thought it was kind of

fascinating to think about this intersection between trying to optimize

a system or control it and also building up these learned models of it while you're doing that to improve control and

at some point I became aware of accelerators and what a great kind of

playground these are for looking at this kind of problem so not

only are they huge you also have many parameters that you need to adjust

hundreds to thousands in some cases and it's becoming more challenging to

control these systems as we start going to more advanced kinds of accelerating technologies right now a lot of people

are trying to accelerate beams in plasma so it's a pretty amazing thing really nonlinear there are also a lot

of really great practical uses of these machines reasons why you would

want to have fine control right so for instance we want to be able to use

accelerators to help treat cancer and that requires fine control over the beam so on the whole this is really just a very rich

area to study this kind of task in the case of the LCLS we're

dealing with beams electron beams that are moving very very quickly close to

the speed of light just to give some context the energy gain of every single electron of

the beam is the equivalent of using thirty three billion AA batteries to accelerate that

beam and the beam can be extremely intense so this is kind of a cool thing

it's a really unique instrument and we can use it to do things like study

photosynthesis how does this work how might that help us understand these systems better but also maybe build

systems that take advantage of that how can we understand protein folding and

maybe that can help with you know drug design maybe we can also look at

chemical reactions and how they're occurring and use this to develop new approaches to making different kinds

of chemical reactions occur and then we also can study things like really

intense conditions like what you might see in an accretion disk around a

black hole for instance so this is the kind of science that this facility

supports and it's really heavily used in 2016 there were just over a thousand

user experiments and since the facility was built in 2009 there have been just

over a thousand papers published and people wanting to run experiments that

help elucidate these sorts of things come very frequently they come for a few

days to a week and they have specific electron and photon beam properties that

they need in order to do this science so how do you actually support that in fact

what we have is an accelerator that's very flexible so we have a laser that

comes in and produces an electron beam we accelerate it through different

accelerating cavities we have to compress the beam which we do with a series of magnets focus the beam and

then eventually once it's sort of gotten to the right shape at the entrance of the undulator we wiggle it back and forth

to produce light which is then sent to at this point in time seven experimental

stations and in order to meet the needs of these different users we actually

have to span kind of a wide variety of beam shapes so here I'm just showing different kinds of electron beam shapes

that we might deliver to different users in terms of how time-consuming this is

over the past few years around 400 hours have been spent tuning per year and

intensive tuning is done on average two to five times per day and on average

this takes about 30 minutes to do for a new user to put some numbers on this the

approximate annual budget of LCLS is about one hundred and forty five million dollars and on average we've had

about five thousand hours of experiment delivery per year and so this means that

it costs about thirty thousand dollars per hour per experiment so what this

means is that tuning up the machine efficiently to meet these kinds of challenging beam conditions is really

important so in a perfect world how do we do this well if we had some

accelerator model and we could just send

a bunch of different adjustments to these various things that we can tune we

would get a perfect prediction and we could just apply it to the real machine in reality that's obviously not the case

we have lots of fluctuations that we can't control for instance the initial laser spot can fluctuate a lot we can have a

pretty dramatic slow drift over time and here we're actually operating a

machine that relies fundamentally on an instability and then on top of this even

if we wanted to run an optimization on our simulation of the machine in reality

this is infeasible generally because we have really computationally intensive

simulations that are needed to actually replicate the machine behavior well in addition to that we also often don't do

that well matching our simulations to measured data so there can be a big gap

here so in practice what do we do we've relied a lot on human operators to tune

the machine and so they're doing all sorts of complicated tasks these are really highly skilled people who have a

lot of experience both with accelerator physics and with the individual machine that they work on and for instance

they'll be doing things like looking at time series data that's constantly streaming on the fly they'll be building

up these internal mental models both of how the machine behavior looks and also

you know getting a feel for if I change this knob what can I expect to see occur

given some overall collective machine state that we're in they're also

oftentimes doing kind of complicated image analysis right looking at images of the beam so there's a lot going on

here that operators are doing and oftentimes they'll also be sort of

constantly tweaking the machines so this is kind of analogous to this shower example that I talked about earlier when

you have that amount of information coming at you all the time and you have

a machine that's as complicated as the LCLS it's really easy to miss kind of

simple stuff so this is an example from a few years ago where the FEL pulse

energy suddenly dropped and it was really difficult to chase down why that happened it could be any number of

things one thing that doesn't happen so often is the cathode QE suddenly

changing in this case that's what happened and so sort of the last thing

that anyone was thinking of and it took hours to find out that that was actually

the cause if there was sort of just a simple

check looking at this and seeing that suddenly there was a change in the average cathode QE this would have been caught

immediately and we could have had a flag out to the operators to say look

here this might be where the problem is so a big point of a lot of this work is

really to see if machine learning can help out with some of these tasks individually and also provide better

tools to the operators as part of that and it's also this question of can we reach better operating conditions or

find new capabilities if we can actually tweak knobs across the whole machine if

we can spend more time tuning the machine versus chasing down

problems like this so that's really part of the big vision so I want to introduce

I guess some challenges that come up in optimization in the context of having

all of these knobs that you're adjusting to try to come up with some specific beam characteristics so we'll

go to this kind of toy example where here I might be a hiker that has decided

I want to camp out for a while and my goal is to camp out at the highest peak but it might be really foggy out maybe I

don't really know anything about the landscape that I'm wandering into so all I can do is see if I take a step in some

direction am I improving or not and so I can keep doing that and eventually I

come to some peak here and then as I continue to move forward I start to see

that I'm starting to go downhill so maybe I just because I can't see what's

going on I just go back to this this peak but in reality I'm only at this

local maximum and I've missed the peak that I actually want to be at on top of

this if I stay here for a while if it's snowing or something maybe I'm camping out for months and I want to keep being

at that peak this might drift over time and so in order to actually stay on this

peak I probably have to keep optimizing continuously
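The foggy-hiker picture above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the optimizer used on the machine: the `landscape` function, its two peaks, the step size and the starting point are all made up so that the climber gets stuck on the smaller hill.

```python
import math

def landscape(x):
    """Hypothetical 1-D terrain: a small hill near x=2 and a taller peak near x=7."""
    return math.exp(-(x - 2.0) ** 2) + 2.0 * math.exp(-((x - 7.0) ** 2) / 2.0)

def hill_climb(f, x, step=0.1, max_steps=1000):
    """Step left or right only when that improves f; stop when neither direction helps."""
    for _ in range(max_steps):
        best = max((x - step, x, x + step), key=f)
        if best == x:
            break  # both neighbours are lower, so this looks like a peak in the fog
        x = best
    return x

# Start at the left edge of the terrain with no knowledge of the landscape.
summit = hill_climb(landscape, 0.0)
print(f"stopped at x = {summit:.2f}, height = {landscape(summit):.2f}")
print(f"taller peak at x = 7.00, height = {landscape(7.0):.2f}")
```

Starting from x = 0 the climber stops near x = 2 even though the taller peak sits near x = 7. If the terrain drifted over time, the same loop would have to be rerun from the current position to stay on top.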

so this is sort of a simple problem to think about in one dimension but when

you start talking about two dimensions already this starts to look a little bit mind-boggling and then when you're

talking about potentially dozens to hundreds of parameters that you might want to optimize simultaneously this

starts to look very very hard and it is so to look at an example of how this is

done in practice on a real machine we have for instance some linac phases and

some energy settings that we can control and we can do exactly the sort of

procedure that I just described where you look at local changes and then slowly move these control parameters

simultaneously in a direction that seems to give some improvement in this case in the x-ray beam energy so here you can

see we're slowly moving towards some better value for the x-ray pulse energy

and in fact if I have some major change that occurs I can kind of track this so

I can account for a drift in the system or sudden changes but this changes a

little bit if I have a model of the world around me right so this could be a

GPS or a map in this case and now even if

I can't really see what's going on when I get to this point I still know that I

haven't reached the peak and I know that it's out here somewhere and so I'll keep going until I can actually verify okay

this is where I want to be this is where the peak is and in fact

I don't actually have to walk there anymore maybe I can just helicopter in and land there convenient

even if my GPS isn't perfect or my map isn't perfect as long as things aren't

drifting too much I'm still probably pretty close to where I want to be and I

can just you know walk around a little bit and get back up to the top of that now snow-covered peak
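The map-plus-short-walk idea can be sketched as follows. It is a toy under stated assumptions: `true_landscape` is a made-up terrain, `imperfect_map` is a hypothetical stand-in for a learned model whose peaks are misplaced by 0.5, and the grid and step size are arbitrary. Scanning the model is treated as cheap, while every call to the true landscape stands for a real measurement.

```python
import math

def true_landscape(x):
    """Hypothetical real terrain: small hill near x=2, taller peak near x=7."""
    return math.exp(-(x - 2.0) ** 2) + 2.0 * math.exp(-((x - 7.0) ** 2) / 2.0)

def imperfect_map(x):
    """Hypothetical model of the terrain: right shape, but every feature is shifted by 0.5."""
    return true_landscape(x - 0.5)

def hill_climb(f, x, step=0.1, max_steps=1000):
    """Local optimizer: move to a neighbouring point only if it is higher."""
    for _ in range(max_steps):
        best = max((x - step, x, x + step), key=f)
        if best == x:
            break
        x = best
    return x

# "Helicopter in": pick the best spot according to the map, which costs no real measurements.
candidates = [i * 0.25 for i in range(41)]       # coarse grid from 0 to 10
model_guess = max(candidates, key=imperfect_map)

# "Walk around a little bit": refine on the real terrain from the model's suggestion.
refined = hill_climb(true_landscape, model_guess)

# For comparison, the same local optimizer started cold from the left edge.
cold = hill_climb(true_landscape, 0.0)

print(f"model guess {model_guess:.2f} -> refined {refined:.2f}, cold start -> {cold:.2f}")
```

Even though the map is off, its suggestion lands close enough that the short local walk reaches the taller peak, while the cold start stays on the smaller hill.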

we've been trying to do the same thing with the LCLS so for instance if I have

a target beam shape that I want in this energy position space I can take a bunch

of historical data and train a model to actually map some requested image that

corresponds to this beam to some suggested initial settings that I might use to get there and we did a test of

this just with two parameters because we weren't sure how well this would work and wanted to keep it simple and then we

also added in closing the gap a little bit with a local optimizer afterward and

compared this with just using the local optimizer so we messed up the beam gave

it this target and then what we found is that the local optimizer did indeed get

stuck in a region of parameter space that it couldn't get out of to reach this

target value but when we had started with some initial settings from the model we

were able to get there if the real system changes a lot over time like

we know the LCLS does one might think that there could probably be a better

way to incorporate this uncertainty in choosing control parameters so for

example if we know over time that we have this kind of variation in something

like snowfall maybe one year the snowfall was really crazy but that doesn't normally happen we can take all

of this variation into account and form some model based on that that has some

error bounds so we can say it's probably most likely gonna be a maximum around

here in any given year and we can then go and go right to that point in our

real world where this really starts to become useful is if you have a region that's totally unexplored so maybe I've

totally mapped out this space but I haven't looked here very much I no longer just have this picture instead I

have this whole other region that I haven't looked at at all that has a really high uncertainty and I might have some

average that my model predicts but it's likely to be wrong the uncertainty is really high and if I am basing this

on where my average is and on the potential so sort of how uncertain I am

about how high this peak could be this actually looks like it could be an even better spot than the spot that I

know is there so if I helicopter over and I look there I see that it's now in

fact a better place to be I now can reduce the uncertainty in

my model about that point and I still may not have a great idea about what's going on in these other regions but I

know a little bit better and so essentially exactly the sort of analogy

is something that we've used at LCLS to tune quadrupoles so in this case

quadrupoles are just focusing magnets that are used to make the beam size such

that it'll go through the undulator properly and we found over many repeated

trials that if we use just sort of a standard local optimizer in this tuning

process we don't do as well as when we use this kind of machine learning model

that has some understanding of both what's going on in the space and what

the uncertainty is likely to be but we can actually do a

little bit better than this we can include some physics so we know for instance that two adjacent magnets will

be kind of anti-correlated in the x-ray pulse energy so we can actually encode this in the way that we set up this

particular kind of machine learning model when we do this if this was our

true system that we were trying to model and we encode some expected correlation

in the model and we choose some points to fit we actually wind up with

something that represents the real system much more closely when we've baked in

that physics understanding ahead of time and when we test this on LCLS we find

that it does help quite a bit to have this physics knowledge sort of included

ahead of time in the way that we've designed this machine learning model and

to again put some numbers on this you can sort of see visually that this is a

clear improvement however when we take into account the operating cost of the

machine and the overall budget you can estimate that this is about six hundred dollars saved and this kind of tune up

is done repeatedly so you can imagine that if you could do this sort of thing

for a bunch of different tuning tasks you could save a lot of money in the

operating budget and also potentially do more science because we're tuning up more quickly you can also include things

like requiring that the pulse energy doesn't go below a certain threshold or

that we're not losing beam into the beam pipe and potentially damaging

the machine and this works kind of nicely and it's done in the same

fashion as predicting the uncertainty of the output value that we care about so you also predict how likely

you are to violate one of these constraints and you avoid exploring those regions so this has been tested at

the Swiss free electron laser and we're working to apply it to LCLS as well without including these safety

constraints you get a lot of intermediate dropouts in the x-ray pulse energy which is not great for a user but

if you include these safety constraints you don't have nearly as much of that

behavior and you're not dipping down to where this threshold is

set at all so so far I've only talked about one parameter output and in

reality in accelerators we care about more than just one parameter there are many parameters of the

beam that we want to have conform to specific characteristics so an analogy

to this is if I just care about expense

when I'm buying a car it's sort of an easy trade-off to assess I just

have some threshold that I'm not going to go above but in reality I have

other constraints I might care about how slow my car is maybe I have some maximum

level of slowness I'm willing to tolerate for instance so it's really a

trade-off now between how expensive of a car can I tolerate and how slow of a car can I tolerate and there are

some good trade-offs in the middle in fact you can maybe draw some curve here

and define some set of possible cars because if you have a car that sits in

this corner this means you've got a really really inexpensive really fast car which doesn't exist if you're up in

this corner unfortunately it means that you overpaid for a really slow car or maybe you were just paying for some

other trade-off maybe you cared more about energy efficiency and the

environment but in this space we're just looking at speed and expense so what

this line really means is that this is the best possible trade-off between speed and expense that I can achieve so

for this dollar amount this is the fastest car that I can get at that price point but I could be anywhere in here if

I don't do my research maybe I'll wind up paying a lot more for

a slower car so we want to do this in general for accelerators when we're

doing experimental setup experimental design and the problem is that

this really requires many evaluations so on the machine you don't really want to

do that and in simulation it's too expensive to do that thoroughly so

oftentimes we'll pick one or two beam parameters that we really care about we'll look at a really narrow range of all these settings and we won't

really explore this space fully so a natural question is can we improve this
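The speed-versus-expense trade-off curve from the car analogy can be sketched as a small Python filter. The car list is invented purely for illustration, and `pareto_front` is a hypothetical helper name; lower is treated as better on both axes (price and 0 to 60 time). Note that every candidate has to be evaluated before the front can be drawn, which is why this gets expensive on a real machine or in a slow simulation.

```python
def pareto_front(points):
    """Keep the points that no other point beats or ties on both axes (lower is better)."""
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)

# Made-up cars as (price in thousands of dollars, seconds from 0 to 60 mph).
cars = [
    (15, 12.0),
    (20, 9.5),
    (25, 10.0),   # costs more than the (20, 9.5) car and is slower
    (30, 7.0),
    (40, 8.0),    # costs more than the (30, 7.0) car and is slower
    (45, 6.9),
    (50, 5.0),
]

front = pareto_front(cars)
for price, seconds in front:
    print(f"${price}k buys at best {seconds} s")
```

The two cars that fall off the front are the ones sitting above the curve in the analogy: for the same money or less, a faster car exists.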

and one way to do this is just to build essentially a model of the simulation

and you can do this using machine learning and what you end up with is a

million times faster execution than the original system we've done this for LCLs

we varied parameters over a really wide range that are used in operations and

then we find that the agreement between the simulation and our model is very good so what this really means is that

instead of if I want to run one simulation of this with all of the

nonlinear effects included instead of needing a supercomputer I can now run this on a laptop and it also means I can

now run it in the control room which is pretty exciting so next the

question becomes well can I actually trust these models when I go to run an optimizer on them this is pushing it maybe

outside of the exact region in which it was trained this is an important question especially for the use cases

that we care about so we've done some studies here at SLAC to try to

start addressing this question so essentially what we've done is trained some machine learning models and

run the kind of optimization that we would normally run on a physics simulation on the machine learning model

and also run on the physics simulation and then we compare these trade-offs see

if we get the same answer so we've done this for kind of hard to simulate

systems at the beginning of the accelerator this is a an injector and we

have a number of input variables that we're adjusting and then we have a bunch of output parameters that we care about

comparing against one another and what we find when we do this is actually that

the answer that we get is very good the orange line here is the

optimization that was done with the ML model and this is again it's just one projection this is just two of the beam

parameters that we compared and when we run the solutions along this line

for the inputs into the simulation these are the X's we find that it agrees

really well so this was kind of surprising and quite exciting the really sort of amazing thing though

is that to get the same exact answer we needed a hundred and thirty times fewer

simulations so really what this means is that just to get the solution not just

to make sort of a faster packaging of it we're talking about nineteen minutes total including making the initial

training data versus thirty-six hours and then in the end we've got a model

that we can reuse as well another interesting thing is that when we

compare the training data which are all these orange dots to the final solution

we actually have interpolated well in this really high dimensional space

and come up with a solution that doesn't quite touch the training data so

it's really sort of indicative that this is doing something more than simply just

memorizing the training set it's able to interpolate in this high dimensional space how far we can push this in

terms of how high dimensional we can go that's an open question this is an interesting initial result another major

problem that we have as I said before is that oftentimes our simulations don't

match our measured data so if we train a neural net or some other machine

learning model on simulation data we'll still wind up with a model

that's wildly inaccurate relative to the real machine so now the question becomes

since we really can't afford to fully explore the parameter space on the machine

and oftentimes what you'll see in archived machine data is that there's

lots of fine-tuning around kind of discrete and separated operating points that's also not so great for trying to

train a model it's also really hard to work with real data on a huge drifting system in simulation it's a

little bit easier so we can do things like do really broad parameter scans so now the question is can we train a

machine learning model on simulation data and then somehow update it with

measured data so that we are actually producing some more accurate representation of the system that also

will be applicable to regions that kind of lie outside of the training data set

that we have from the measured data and we've tried this with small systems

again where we have a number of beam inputs machine inputs and a number of

beam parameter outputs and then done some fine-tuning based on measurements and we're able to

close that gap quite a bit and the really important thing here is that

you don't get this kind of accuracy if you just train on the measured data alone in this case because we had

so few measured data samples to deal with so this is another interesting

area that we're exploring and trying to apply to larger systems like the LCLS

another kind of important area that a lot of us are interested in is whether

or not we can actually learn something from these machine learning models can we find kind of unexpected correlations

in the machine that nobody knew were there and it seems like we can to some

degree so at SPEAR3 which is here

at SLAC if you look in the data archive the efficiency with which

they're injecting into the ring varies quite a bit

and the operators were constantly kind of adjusting feedback settings in order

to chase this and make sure that it didn't drift too far below what they

required so what they did was take a

machine learning model and throw all of this archived data at it and found

that it was able to predict reasonably well they used a bunch of measured inputs from the machine and when they

look at the sensitivities of the output to these various input parameters they

found that there was a really strong correlation with a ground temperature

sensor in the parking lot and that's not

likely something that someone would have found on their own just digging through the data so now other than being

kind of funny the useful thing here is that now they can most likely just look

at this ground temperature sensor and then based on that get some idea of what

the ideal position set points are and just punch those into the machine and

then hopefully they don't have to do as much reoptimization so another area

where there's a lot of interest and a lot of work being done is trying to use

machine learning to extract more information out of a lot of these complicated signals

that we get off of the machines so for instance we have lots of images of the beam we have things like RF time series

readouts and just to give you a couple of examples of this we have at LCLS

this really nice diagnostic that gives us this energy time image of the

electron beam and this is used frequently in user operations so for instance you can take an image where

you've suppressed the lasing process so you're steering the beam in here in a way that doesn't

allow it to actually produce a photon beam and then during the experimental

run you can constantly be taking these images and these spikes here that you

see are actually indicative of the electron beam giving up some of its energy to the photon beam so what this

does is it allows you to use the combination of these two images to infer

information about the photon beam that you normally wouldn't have access to so

this is called the x-ray power profile the problem is that this is kind

of a slow algorithm it also fails in cases where these spikes start to kind

of fold over on one another because it's just looking at vertical slices of the image so this question now is can we use

a neural network to look at this image directly and have better information

about like how different positions are correlated with one another and then

get an estimate of this x-ray power profile more quickly and maybe a bit more accurately and we've

started working on this and there's some initial results that look pretty promising so this blue curve is the

traditional reconstruction algorithm and then the red curve is the prediction

that we get from this convolutional neural network that we've trained on lots of simulation data and the green

which you can barely see is the actual predicted x-ray power profile so this is

already outperforming the standard reconstruction algorithm how this will

translate into the real world how well this will generalize to new machine States these are the kinds of much more

challenging you know open questions that we're working on trying to answer now another major area of interest in the

community is can we use these kinds of techniques to identify when something is going wrong with the

accelerator so for example for LCLS-II which is an upgrade to LCLS we're

relying on superconducting radiofrequency accelerating cavities and

these are really sensitive they also have these very large cooling systems

associated with them which have their own complex behavior and if one of these

cavities fails in what we call a quench it can be pretty bad this cryo

plant can take a while to cool back down we can damage the cavities

it also affects user time obviously and it's not so simple it's not as though

these things fail the same way each time so right now oftentimes experts

sitting in the control room will look at RF waveform data and try to say try to

figure out what kind of quench does this look like what kind of actions can

we take or if they're looking at a past example what action did we take how effective was that how can we fold that into improving

next time and this is as a result kind of a slow laborious process at Jefferson

Lab they've started doing some work that instead uses various machine learning algorithms to look at these

signals directly and automatically classify what kind of trip it is and then now that they've implemented the

system they're building up this huge database of examples of types of trips

what action was taken to try to recover from that trip and hopefully that'll

help them figure out new ways of informing you know if suddenly from the

control room I get a signal that this type of trip is happening or is about to happen maybe now I know better what

things I need to change like lowering the amount of RF power going into the cavity by a certain amount in order to

either avoid the really bad consequences that might come out of that trip or just recover from it more

quickly sort of a similar example where

you're trying to figure out whether to trust a signal or not or whether

something's going wrong with that signal is this case of trying to assess when a

beam position monitor might be giving you a faulty reading so in any accelerator we have some ideal beam

trajectory or beam orbit and oftentimes what you want is this just to go along

the center and the way that you do this is by adjusting steering magnets along

the beam line and along the whole thing you also have beam position monitors that can tell you how offset you are in

the space but sometimes these give you bad readings so if you get a bad

reading you really don't want to be adjusting this magnet in order to compensate maybe this reading says the

beam is way over here so then you make this magnet really really correct that

beam to go way over here but if this reading was wrong and the beam was actually in the middle now you've steered the beam into the wall this is

the sort of thing that you want to avoid we also use beam position monitors to

get some idea of what the kind of optics overall in the accelerator are so you

also don't want to be subject to these kinds of errors when you're trying to get a measurement of what's going on in

the machine so we have standard

techniques to deal with this but it's not totally perfect it won't remove all of the bad BPM signals a group at the

Large Hadron Collider has been using machine learning to try to classify when

a BPM is giving a bad reading or not and then throw it away before using

it in a correction or using it to give some information about what the machine is doing so this is just showing how

many spikes how many bad BPMs are removed with the standard algorithm this is the

number that are still remaining afterward and then these last

few that are really tricky to identify they were able to remove with the

clustering algorithm called an isolation forest and they also found that when

they apply this to their optics measurement they wind up with a much

less noisy signal so this is a much more reliable estimate of what's really going on in the machine
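
A minimal sketch of the bad-BPM idea just described, using scikit-learn's `IsolationForest`. Everything about the data is an assumption made for illustration: the readings are synthetic, the two features per BPM (horizontal and vertical offset in mm) are hypothetical, and the talk does not say which features or settings the LHC group used.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic stand-in for BPM readings (assumption, not real machine data):
# 195 healthy monitors report small offsets around the ideal orbit.
good = rng.normal(loc=0.0, scale=0.1, size=(195, 2))

# Five hypothetical faulty monitors report offsets far from the orbit.
bad = np.array([
    [4.0, 1.5],
    [-3.5, -2.0],
    [1.0, 5.0],
    [-1.5, -4.5],
    [2.5, -3.0],
])

readings = np.vstack([good, bad])
bad_idx = np.arange(len(good), len(readings))

# contamination is the assumed fraction of faulty readings. Here it is known
# because the data was constructed; in practice it has to be chosen or tuned.
forest = IsolationForest(
    n_estimators=200,
    contamination=len(bad) / len(readings),
    random_state=0,
)
labels = forest.fit_predict(readings)  # -1 = flagged as anomalous, 1 = kept

flagged = np.flatnonzero(labels == -1)
clean = readings[labels == 1]

print("flagged BPM indices:", flagged)
print("readings kept for orbit correction:", len(clean))
```

Only the rows in `clean` would then be passed on to an orbit correction or optics measurement, which mirrors the "throw it away before using it" step in the talk.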

so overall each of these applications

kind of fall into a few major use cases so we want to use machine learning to

help speed up system optimization we want to have faster models that are more

accurate relative to what's really going on in the machine we want to be able to get more useful information out of all

of the machine signals that we do have we also want to find out if there are

hidden sensitivities in the machine that we can you know maybe take advantage of to help out when we're trying to do

tuning and then we want to detect when something's going wrong in the machine we'd also like to predict when something

bad's about to happen like if we can predict an RF cavity quench we can

avoid it altogether so I showed a lot

of work from what's sort of a growing international community that's

interested in this particular research area so in February of last year we had

the first machine learning for particle accelerators workshop this was kind of the first time people who were

interested in this area came together and exchanged ideas talked about challenges that they have at different

facilities and you know it may look

like a small group but this is growing rapidly and now it's like we have a

small group that's working specifically on applying machine learning for

accelerators so many of these people are actually kind of full-time working in this area it's pretty exciting and I

think in general a lot of the consensus in the community is that machine

learning is really a complementary approach it's not going to fix all of our problems it's something that

provides a set of useful tools that we can maybe use for some of these specific

areas where we know we have a need but overall it's it's been sort of not such

a well explored area yeah there's a lot of opportunity to try new things and see what's gonna work and what's not going

to work and get big performance gains and along with that I mean we're very

much still in the R&D phase like we don't we don't know fully what's gonna work what a lot of the pitfalls are

going to be we know that there are a lot of open questions about you know how to

update our models over time how to have some prediction of model uncertainty

along with just having some raw beam parameter output these sorts of things

but the general hope is that by using these tools and kind of figuring out

what works and what doesn't we might be able to open up a lot of new capabilities so maybe we can reach more

challenging beam configurations at the LCLS that we weren't able to do before by just tweaking lots of knobs

simultaneously or getting some better understanding about sensitivities in the machine that we weren't aware of we also

might actually be able to help out a lot with optimization of systems that have

traditionally been really really difficult just because they're so nonlinear and so complicated and really

hard to tune things like plasma based accelerators for instance in some cases

machine learning might not just be a help here it might actually really make some big leaps forward as far as being

able to do sort of larger scale system optimization to bring things like the

energy spread down for these types of accelerators um we also if we can speed

up our simulations maybe we can do more thorough optimization again tweaking lots of

knobs across the whole machine maybe we can plan out our experiments better ahead of time so that we can really set

up the machine well right at the start rather than doing a lot of kind of hand tuning during

the user run we also have a lot of

opportunity to improve existing capabilities so simply doing faster tune-up and avoiding failures that'll

enable us to do more science it's also definitely important that a lot of these

tools are used to really help the operators do things a little bit better

so for instance if you know an operator can see that a certain kind of trip was

about to happen with an RF cavity that helps them out a lot because then maybe they don't have to

spend the next 30 minutes trying to figure out what kind of trip that was

this is also a case where we can take technology that we're developing

for accelerator systems like the LCLS and maybe then transfer it out into

the medical and industrial setting so that we can do things like finer control

over beams for proton therapy so going sort of out of the R&D stage to tech

transfer it's also I think important to

note that a lot of what we're working on likely will require a lot of

development and support at the beginning but so far a lot of the initial signs that we have are that this

should pay off everything looks fairly encouraging it's also kind of an

interesting way to tie in machine learning research we actually have a pretty unique set of systems we have a

lot of data we have a simulation of the environment that is somewhat good

relative to the real system unlike something like a self-driving car where you'd have to model the whole world which is very hard to

simulate um we also have really high dimensional parameter spaces and then we

have these issues where things are drifting over time these are kind of really interesting challenging systems

to work with from a machine learning perspective so there's an opportunity to give back to machine learning research here as well so with that um I

hope that this gave you kind of a high-level view of the things that we

are looking at and wanting to do with machine learning for particle accelerators and that this is a kind of

an opportunity to think about you know what else could we do what uh what new things could we do if

we start using a lot of these tools thank you [Applause]

so if you have any questions for Auralee please press the speaking man button on

your microphone please I'm a little

confused you've got data from operating the system and you've also got a simulator

what should you use to train the neural network yeah so ideally we want to use

both right but on their own individually they both have their merits right if I

just if I simply want to take a really slow simulation and I just want to speed

it up I don't actually need to use any measured data I can just generate my simulation data train my model and now I

can just use that as a faster executing version of what I had before if I am

working with a system where I don't really care about using it for optimization right maybe I just want a

prediction and I know I'm not going to be going into new machine states maybe I

just want to do really well at predicting some information based on some like other measured data in that

case you may not really care whether you're covering other regions of the parameter space as well so then maybe

you can just take a chunk of measured data and just train on that so the simulator lets you go places that you

haven't actually operated in that's correct I see thank you it also helps as far as actually being efficient

about how you're training the model so for instance if I only have enough beam time to get 300 data points on a given

machine that's not enough to train something like a neural network model but if I have a simulation of that part

of the accelerator I can just generate a ton of data with that train my neural

net and then fine tune it with the 300 data points that I have and then I'll

wind up with something where the simulation is giving me something close enough to the right answer that it only

needs those 300 data points to converge so there are all sorts of ways where

you can usefully combine both simulation data and measured data have the operator

the default operator tools actually been changed in accordance with the machine

learning results that you've gotten yes so for the Gaussian process work that I

talked about where we were trying to maximize the FEL pulse energy by adjusting focusing magnets that's been

fully integrated into the LCLS operating system and actually has reduced tuning time for that task quite a bit over the

past year great really fascinating work

I just have one question I want to ask right now what packages or libraries are you using primarily in your research and

development yeah so I'm speaking for myself I tend to use PyTorch I also for

simple prototyping use Keras for doing optimization there's a package called

DEAP which is really good for multi objective optimization and evolutionary algorithms a lot of other people in this

area who I work with find that Keras is often all they need for a lot

of these problems much of what we're tackling right now is still simple enough that we don't really need to we

don't need the performance enhancement that you get from using pi torch or raw tensor flow

maybe a bit more of a generic question I thought it was really fascinating to see what a step up you can get in

performance from using neural networks rather than the previous simulation you had do you think we're

reaching a point in time where maybe improvements in software are overtaking

Moore's law in hardware and if so would you expect to see sort of a Moore's Law equivalent in software going forward

yeah that's a really interesting question and it's one that I'm frankly

not really qualified to answer what I will say is that there's a lot of work

that gets put into making particle accelerator simulations themselves

faster that includes things like converting the code so that it can run

on GPUs converting it so that you can run massively parallel simulations in an

efficient way because you're dealing with you know essentially a cloud of electrons and they're all interacting

with one another so this is something that you know you can parallelize that computation I guess the real

takeaway for me from that particular line of work where we've used some

simulation data to train a neural net is that this is now something that once took you know many many minutes or

many hours to run that I can now package in a way that can run on my laptop in under a millisecond and

to most people that I've talked to that seems like something that we would have

a hard time getting to just purely algorithmically on the simulation side

certainly if you have some extremely large supercomputer and you have tens of

thousands of core hours available to you on it you know that's one route to

speeding things up but in terms of commenting on you know scaling laws and

sort of what fundamental limits we might be hitting with regard to computation I can't

I can't really comment good thank you

for a wonderful talk could you say more about the parking lot temperature there must be a good story that follows that

yeah so actually a lot of the speculation around what that might be

centers around the fact that you don't have perfect temperature

control of the actual accelerator itself right and these are all components that will expand and contract thermally so

it's possible that the parking lot temperature sensor and it's underground is really just giving you

some indication of how much thermal expansion or contraction is really going

on in the actual accelerator and this kind of makes sense nobody's gone and

verified whether that's the case you know measured distances or things in

surveys between you know some of the summer months and some of the winter

months but that's sort of the reasonable explanation that comes to mind I'm sure

someone will dig into it further

thank Auralee again for an excellent [Applause] [Music]


All content is © SLAC National Accelerator Laboratory. Downloading, displaying, using or copying of any visuals in this archive indicates your agreement to be bound by SLAC's media use guidelines
 

For questions, please contact SLAC’s media relations manager: 
Manuel Gnida 
mgnida@slac.stanford.edu 
(650) 926-2632 
 

SLAC is a vibrant multiprogram laboratory that explores how the universe works at the biggest, smallest and fastest scales and invents powerful tools used by scientists around the globe. With research spanning particle physics, astrophysics and cosmology, materials, chemistry, bio- and energy sciences and scientific computing, we help solve real-world problems and advance the interests of the nation.

SLAC is operated by Stanford University for the U.S. Department of Energy’s Office of Science. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time.
