Transcript/A²I: Abstract² Interpretation

From PKC
Revision as of 14:05, 15 January 2023 by Benkoo2 (talk | contribs) (→‎Assembly AI processed Transcript)

Assembly AI processed Transcript

Original Page

Everyone, welcome to the second session on abstract interpretation. At the end of the session there will be time for questions. To keep things rolling quickly, I'll ask that people in the room come up to the front to grab the mic and speak into it for the remote recordings. Also a reminder that you can ask questions online. Without further ado, I will hand it off to our first speaker this morning, Patrick Cousot. Okay, thank you. I start with an animation done by my co-author. You like the animation? Yeah? You think? It's not me, it's Giacobazzi. So I have a first question: who was attending POPL 1977? Me. I was the only one.

I have another question: what is the invariant in these papers? If you know the answer, you have already understood the talk, and you can leave.

Here is the answer: these papers analyze the analysis, and that is the subject of the talk today. I will formalize it, starting with a generic abstract interpreter. I will define its semantics and its abstraction, and then we are finished. So this is the generic abstract interpreter. You iterate from some value; while you have not converged, you compute the next iterate with your transformer. I will change that a little by adding an index which says that at each step the abstract domain may change. If I change the abstract domain, I have to change the convergence test, and I have to change the transformer from one domain to the next one. I also add a passage to the limit. I do that because I want to be able to say that semantics are instances of the abstract interpreter. If you take denotational semantics, all the domains Dᵏ are the same; they are DCPOs. You start from bottom, you iterate with a Scott-continuous transformer, you never stop, and at the limit you take the join, which is the least fixpoint. So you can program that. You can also instantiate it with iteration with widening, and if you have a widening, you may want to change it over time: at the beginning you delay, you make no widening, and as time goes on you make it more and more imprecise to enforce convergence. That's why I have the index. So this is the semantics of this abstract interpreter. I collect the sequences that I have computed, so they are related by this transformer, and maybe they converge; in that case I repeat forever the point where I was at convergence, and then I pass to the limit with this function. Or it does not converge and goes on forever, and then I pass to the limit. So my semantics is the set of these sequences, and my collecting semantics, the strongest property, is the singleton containing the set of these sequences. That's classic.
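The loop just described, iterating from an initial value with a possibly different convergence test and transformer at each step, then passing to the limit, can be sketched as follows. This is a toy rendering under my own naming, not the paper's formalization:

```python
def abstract_interpreter(init, converged, transformer, limit, max_steps=1000):
    """Generic abstract interpreter sketch.
    converged(k, x, y): step-indexed convergence test (the domain may change with k).
    transformer(k, x):  step-indexed transformer from domain k to domain k+1.
    limit(trace):       passage to the limit over the sequence of iterates."""
    trace = [init]
    x = init
    for k in range(max_steps):
        y = transformer(k, x)
        trace.append(y)
        if converged(k, x, y):
            return limit(trace)   # converged: take the limit at the stable point
        x = y
    return limit(trace)           # never converged: pass to the limit anyway

# Instance: an ascending interval iteration whose upper bound is capped at 10.
step = lambda k, x: (x[0], min(x[1] + 2, 10))
result = abstract_interpreter((0, 0), lambda k, x, y: x == y, step,
                              lambda trace: trace[-1])
```

For denotational semantics, start from bottom, take a Scott-continuous transformer that never converges, and let `limit` take the join of the chain; for an analysis with widening, let `transformer` widen more aggressively as `k` grows.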
Now some remarks: the semantics of the generic abstract interpreter is an instance of the abstract interpreter, so the collecting semantics is also an instance. And if you make a sound abstraction of the generic abstract interpreter, you get another instance of the generic abstract interpreter. There are two consequences. The first is that you can have a hierarchy of abstract interpreters; the other trick is that the generic abstract interpreter can be used to analyze the generic abstract interpreter or any of its instances. That is the idea of the talk.

I will start with a very simple example: while true, increment x by two. My interpreter wants to do an interval analysis. The transformer establishes these equations: at point ℓ1, either I come from ℓ0 and x is zero, or I come from ℓ2, so the value of x at ℓ1 is the join of the two intervals. And when I am at ℓ2, I am coming from ℓ1, where the invariant held, incremented by two. If I solve this, I get an infinite iterate, and when I pass to the limit, I get this. But I cannot do an infinite computation, so I will do an analysis of this analysis. To do that, I need a collecting semantics. What I will collect here is the sequences of intervals that I have computed. The sequence at point ℓ1 is the previous one, and then I add my equation, where I take the value in the sequence of the other point, ℓ2. At ℓ2, the sequence of computations is the last one I got at ℓ1, incremented by two. So my collecting semantics is going to compute this, and when I pass to the limit, I have this set. Of course, I cannot compute the collecting semantics of the abstract interpretation, but I can do an abstraction of it.
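The interval equations of this example can be iterated concretely; this sketch uses my own encoding (`None` for bottom, pairs for intervals) and shows the divergence of the Kleene iterates:

```python
# Interval equations for:  x := 0; while true: x := x + 2
#   X1 = [0,0] join X2 ;  X2 = X1 + [2,2]
BOT = None  # bottom interval

def join(a, b):
    if a is BOT: return b
    if b is BOT: return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def plus2(a):
    return BOT if a is BOT else (a[0] + 2, a[1] + 2)

x1, x2 = BOT, BOT
iterates = []
for _ in range(4):
    x1, x2 = join((0, 0), x2), plus2(x1)   # one joint Kleene step
    iterates.append((x1, x2))
# The upper bounds keep growing: [0,0], [0,2], [0,4], ... so the iteration
# never stabilizes without widening; the limit would be x1 = [0, +infinity).
```

Printing `iterates` shows the strictly growing chain that motivates analyzing the analysis.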

The abstraction that I use is the domain with bottom, constants, and top, applied to pairs of bounds. You see, I have pairs here, but the abstraction is not the constant-propagation one; it is this one. I abstract componentwise, and when I have an infinite sequence, I take the lub of the lower bounds and the lub of the upper bounds. If I get a constant, it means I have a stable bound; if I get top, it means I have a non-stable bound. Okay? So now I can derive this equation from that one by abstraction, and I get this. For example, the lower bound at point ℓ1 is the one I had before, plus zero, which comes from the equation "initialize to zero", plus this term, which comes from the equation "increment by two". Now this I can solve, and the iterates are finite. The result says the lower bounds in both places are stable and the upper bound is not stable. So the problem of stability is only on the upper bound, and this tells you where you want to do widening. The meta analysis tells me where I have to do the widening in the analysis. Okay, and how do I get this equation? By calculation, and it is very simple. You start from something which is more or less "the abstraction of the concrete"; you unfold; then you have to have some intelligence, so there is one critical step here; then you fold, and you get the result: the abstraction of the abstract function of the abstraction of the concrete parameter. Then you apply the fixpoint transfer theorem, and you prove your abstraction is correct.
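The bound-by-bound abstraction into constant-or-top can be sketched like this, as a toy over finite prefixes of the iterate sequence (names are mine):

```python
TOP = "top"

def alpha(bounds):
    # Abstract one bound sequence: a stable bound gives its constant value,
    # a changing (non-stable) bound gives top.
    return bounds[0] if all(b == bounds[0] for b in bounds) else TOP

def where_to_widen(interval_seq):
    lows  = [lo for lo, hi in interval_seq]
    highs = [hi for lo, hi in interval_seq]
    return alpha(lows), alpha(highs)

# On the iterates of the example: stable lower bound, unstable upper bound,
# so only the upper bound needs widening.
print(where_to_widen([(0, 0), (0, 2), (0, 4), (0, 6)]))  # (0, 'top')
```

The result `(0, 'top')` is exactly the information the meta analysis provides: widen the upper bound only.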

Okay. So there are two kinds of meta abstract interpretation: the ones that are offline, which you do before the analysis, and the ones that are online, my preferred ones, which you do during the analysis. Offline, we have already seen one example: make an analysis of the analysis to determine the widening of the analysis. Another one is the packing in Astrée, where you have octagons over tens of thousands of variables, which does not scale. So you pack the variables into small groups: if you are in the same group, you are related; if you are in different groups, you are not related. You choose the groups to be of size ten or twenty. Then it scales, and you lose some information, but a pre-analysis determines the packs. I don't insist too much.

I go to online. To do online, I start from the abstract interpreter that we have seen before, and I keep an abstraction of the computation. This is in purple: for the initialization I keep an abstraction, and then here, when I compute the next step, I compute what I add. So I have the sequence of computations; I add the last one and I re-abstract the result. It's a classical abstraction. You see, alpha of x-bar accumulates, in the abstract, the sequence of computations I made for x. Then I check convergence: if I have not yet converged, I pass that to the meta abstract interpreter. What I pass is the abstraction of the computation plus the current domain, and it returns me the next domain. Of course, if you don't like the meta abstract interpreter, you can expand its code inline here, and then there is no visible meta abstract interpreter at all; but macro expansion is not very interesting. And here is the meta abstract interpreter. You have it here: it receives the abstraction of the computation and some domain; from that it does some iteration, because it is an analysis, and it returns the next domain. You see, it's very simple.
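A minimal sketch of this instrumentation, assuming a toy meta policy of my own invention (switch the domain to widening as soon as the abstracted history shows growth):

```python
INF = float("inf")

def online_interpreter(init, step, converged, meta, alpha, max_steps=100):
    """Abstract interpreter instrumented with an abstraction of its own run:
    at each step it hands alpha(history) and the current domain to the meta
    abstract interpreter, which returns the next domain."""
    x, domain = init, None
    history = [x]                 # sequence of iterates computed so far
    for _ in range(max_steps):
        y = step(x, domain)
        history.append(y)
        if converged(x, y):
            return y
        domain = meta(alpha(history), domain)  # abstraction of the computation
        x = y                                  # plus current domain -> next domain
    return x

# Toy instance on intervals with growing upper bound.
def step(x, domain):
    return (0, INF) if domain == "widen" else (0, x[1] + 2)

def alpha(history):
    # Abstraction of the computation: last growth of the upper bound.
    return history[-1][1] - history[-2][1] if len(history) > 1 else 0

def meta(growth, domain):
    return "widen" if growth > 0 else domain

result = online_interpreter((0, 0), step, lambda x, y: x == y, meta, alpha)
```

Here `result` converges to `(0, inf)` because the meta interpreter sees the growth and instructs the analysis to widen.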

So we have to have at least one application. I will do an application to the design of widenings, not the same as the first one: the first was offline, now it is online; during the analysis, I decide on the widening. I take back intervals. You analyze a variable over time; it is in an interval that grows. That's what the analyzer is doing. So I have to find an abstraction for the meta analyzer, and my abstraction is the slope. I take the slopes going from one point to the other: the slope here is zero, the slope here is this, more or less; I made up the picture. So now, instead of having a sequence of iterates, I have a sequence of pairs of slopes, but it is still infinite, so I have to do something about it. I could take the average, whatever you want; here I take the maximum. But if I take the maximum, I am in a domain which is infinite. So when I do the analysis of the interpreter, I go to the meta interpreter, but the domain is still infinite, so I need a widening in the meta abstract interpreter. You can imagine any one, like thresholds or things like that, and this will enforce convergence. So this gives me a widening in the abstract interpreter, determined by an abstraction of the computation, and a meta widening in the meta abstract interpreter. Beautiful, no? Who said no? Okay, so let's have another application, which is to relational domains, and I am ahead of time, so if you have a question, I can answer it. No question. When you take numerical relational domains like octagons, they are polynomial, n cubed; others, like polyhedra, are exponential in the number of variables. If you have tens of thousands of variables, you are not going to make it.
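The slope abstraction with a threshold meta-widening can be sketched as follows (the threshold value and all names are illustrative, not from the paper):

```python
INF = float("inf")

def max_slopes(intervals):
    # Abstraction of the computation: maximum slope of each bound across
    # successive iterates.
    los = [lo for lo, hi in intervals]
    his = [hi for lo, hi in intervals]
    slope = lambda bs: max((b - a for a, b in zip(bs, bs[1:])), default=0)
    return slope(los), slope(his)

def meta_widen(slope, threshold=10):
    # Meta-widening in the (infinite) slope domain: above the threshold,
    # give up and jump to infinity to enforce convergence.
    return INF if slope > threshold else slope

def widen(interval, slopes):
    # The widening chosen for the analysis: extrapolate each unstable bound.
    lo, hi = interval
    s_lo, s_hi = slopes
    return (-INF if s_lo < 0 else lo, INF if s_hi > 0 else hi)

slopes = tuple(meta_widen(s) for s in max_slopes([(0, 0), (0, 2), (0, 4)]))
print(widen((0, 4), slopes))  # (0, inf)
```

The stable lower bound (slope 0) is kept; the growing upper bound (slope 2) is widened to infinity.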

So the idea is to do packing: you decompose into a conjunction of relations over separate packs of variables. We have seen that already, done by Miné in Astrée for octagons, but Miné was doing a pre-analysis, so an offline abstract interpretation, whereas here we want to do it online. This was invented by Halbwachs and his colleagues; then Singh, Püschel, and Vechev, they must be somewhere here, generalized it to octagons first and then to arbitrary relational numerical domains, and the trick is that their decomposition brings no loss of information. What we do is generalize to any relational domain, and this is in this paper. The trick is that with an arbitrary relational domain you may lose some information: you don't have enough information in the abstract domain to be extremely precise, but it works, at least for the complexity. A relational domain is a relation between things, which can be variables, points, whatever you want, addresses on the heap, and usually it is exponential in the number of things. So here it is exponential in six. The idea is to do a dependency analysis of the relations: you see that x1, x2, x3 depend on one another, x4 depends on nothing, and x5 and x6 depend on each other. So you move from this domain to this one, which is three instantiations of the basic domain, one for each pack. Now, instead of exponential in six, I have at most three times exponential in three, which is much smaller.
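A sketch of the dependency-based packing, using plain connected components (union-find) as a stand-in for the real dependency analysis:

```python
def packs(n_vars, relations):
    """relations: pairs (i, j) of related variables; returns the packs
    as the connected components of the dependency graph."""
    parent = list(range(n_vars))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i, j in relations:
        parent[find(i)] = find(j)          # union the two packs
    groups = {}
    for v in range(n_vars):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

# x1,x2,x3 depend on one another; x4 on nothing; x5,x6 on each other
# (0-based indices): the cost drops from exp(6) to about 3 * exp(3).
print(packs(6, [(0, 1), (1, 2), (4, 5)]))  # [[0, 1, 2], [3], [4, 5]]
```

When the analysis later relates two variables from different packs, re-running this computation merges their packs, which is exactly the change of abstract domain discussed next.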

So it's much more efficient. Of course, when you do some operation, for example you establish a relation between x2 and x4, the packs are going to change: x4 joins the pack, and this one is unchanged. So the analysis changes the abstract domain at each step by computing the packs. Oh, that's beautiful: you have a different abstract domain and a different transformer at each iteration, which come from the initial transformer that you had defined for the analysis. What you need is a functor that moves from packs of variables to domains, so that all the abstract domains D¹, ..., Dᵏ, D^∞ are just instances of the parametric abstract domain that you have created. Okay, there is more in the paper: more semantics, more abstractions, more algorithms, more of what you want, and it is time to conclude, ahead of time. Abstract interpretation, I understand it as a way to explain things, to construct things, to derive an algorithm, to do calculational design. And we have shown in the past that it applies to deductive methods, in the 80s; to dataflow analysis, we did that in the 70s in POPL papers; and to model checking: we had a paper on temporal abstract interpretation at POPL 2000, showing that model checking is an abstract interpretation of something, by something.

The trick is that in that paper we also showed that if you have a finite abstract domain with some logic, that is not enough to do the model checking. So finiteness is not enough: you also need properties on the abstract domain, which come from the temporal logic you use. Types: that is POPL '97, where we proved that Milner's types and intersection types are abstract interpretations. No one ever read this paper, so if you have nothing to do, you can read it. Symbolic execution: in my thesis. And today you can apply abstract interpretation to itself. That's nice, thank you. Thank you, and thanks for reminding us that this is a distinguished paper. For the first question, I'll take... I am finished. The first question is from someone anonymous, or maybe it's the duck, anonymous, I don't know, but it's an intelligent question: when you do offline meta analysis, do you analyze just the analyzer, or the analyzer applied to a particular program? As you want. In my presentation there was a P in parameter, so you can do it for a program.

I think it's better to do it for a program; it will be more precise. For example, the packs: if you want packs valid for any program, you have no chance to get anything; if you do packs for one program, you will have something interesting. So the idea is: you build the equations of the concrete analyzer and then of the meta analyzer; you take these equations and you abstract them. The abstraction will always be the same, but it will be applied to one particular program, in which case the offline analysis will be doing something like what the online analysis would do. In some sense, the offline will be less precise, because it has to do something that is valid for all iterates, whereas online you only have to do something which is valid for the past iterates, and you can observe the evolution of the iterates. So you could have A³, A⁴, A⁵ abstract interpretation.

But I resisted my colleagues who wanted to go there: two is enough; if you count one, two, you have made it. Any more questions? I can answer a question from the previous talk, which was: can we automatize the computation? This was shown in Agda, with Van Horn, in a paper of two or three years ago. The answer is no in general, because those abstractions are homomorphic, that is, partitioning, and the counterexample is intervals. So it works only for a very specific kind of abstract interpretation, which is very limited. Yeah, thank you for the awesome talk, it was beautiful. Could you comment on the performance of the online...? What is the question? Can you repeat it? The performance of the online abstract interpretation. Because if it's offline, it's going to be done once, but online you're changing the abstraction as you go. That is, when you take the relational example, you have an extra cost to compute the partitioning. I have no general answer, because it depends; if the relation is trivial... But Vechev and his colleagues proved that, at least for numerical domains, this computation is very small compared to the big computation that you do on polyhedra and things like that. So I have no general result on that. Okay. Do you have applications in dynamic program analysis? I know you mentioned it, but you didn't... Dynamic program analysis: you have interval analysis, done by Moore in 1966; you can formalize it by an abstract interpretation where you abstract one trace, and the abstraction is many traces. But I mean for the meta: can the meta do dynamic analysis? No, I have no idea. I have to think. Okay. Which is my job. Yeah.

Auto Chapters

ABSTRACT INTERPRETATION

Second session on abstract interpretation. At the end of the session, time for questions. In order to keep things rolling quickly, I'll ask that people in the room come up to the front to grab the mic. Then I will hand it off to our first speaker this morning, Patrick Cousot.

INTERMEDIATE SEMANTICS OF A GENERIC ABSTRACT INTERPRETER

I start with a generic abstract interpreter. I will define its semantics and the abstraction. What is the invariant in these papers? If you know the answer, you have already understood the talk. You can leave.

THE ABSTRACT INTERPRETATION OF A GENERIC INTERPRETER

You can have a hierarchy of abstract interpreters. The other trick is that the generic abstract interpreter can be used to analyze any of its instances. There are two kinds of meta abstract interpretation: the ones offline, that you do before the analysis, and the ones that are online.

FOR EXAMPLE, THE META-WIDENING OF THE ABSTRACT INTERPRETER

If I take the maximum, I am in a domain which is infinite. So I need a widening in the meta abstract interpreter. This will give me a widening in the abstract interpreter which is determined by an abstraction of the computation. Beautiful, no? Who said no?

DECOMPOSING RELATIONAL DOMAINS INTO PACKS

The idea is to decompose into a conjunction of relations over separate packs of variables. The trick is that the decomposition brings no loss of information. What you do is generalize to any relational domain. When you have an arbitrary relational domain, you may lose some information.

APPLICATIONS OF ABSTRACT INTERPRETATION

Abstract interpretation, I understand it as a way to explain things. And we have shown in the past that it applies to deductive methods in the 80s and to dataflow analysis in the 70s. Today you can apply abstract interpretation to itself.

OFFLINE VS. ONLINE META ANALYSIS

When you do offline meta analysis, you analyze just the analyzer, or the analyzer applied to a particular program. I think it's better to do it for a program; it will be more precise. Could you comment on the performance of the online meta analysis?

INFERRING FROM PROGRAM ANALYSIS

Do you have applications in dynamic program analysis? Dynamic program analysis: you have interval analysis done by Moore in 1966. You can formalize it by an abstract interpretation. Can the meta do dynamic analysis? No, I have no idea.