Feedback log. Each entry lists the draft version, slide number, commenter, comment, and (where given) the reply or applied status.
Draft 1, slide 3 (Christophe): "Masters" --> "Master" (two times)
Draft 1, slide 4 (Christophe): The definition of NN seems counter-intuitive: are they made up of layers of artificial neurons, or of matrices and vectors? Starting with the figure would be better, I think.
  Reply: I added some slides for the explanation.
Draft 1, all slides (Christophe): Full stops are missing at the end of many sentences.
Draft 1, slide 6 (Christophe): "your output" --> "the output"
Draft 1, slide 7 (Christophe): You have not expressed a problem, just facts; what is the problem?
  Reply: These computations are expensive in terms of time and memory, making them hard to use in real-world scenarios.
Draft 1, slide 8 (Christophe): "aimed at working on NN acceleration" --> "aims at accelerating NNs"
Draft 1, slide 8 (Christophe): The learning and inference stages should have been defined earlier, when introducing NNs.
  Reply: I added a point on slide 5.
Draft 1, slide 10 (Christophe): Seems incomplete ...
  Reply: True. Covered by point 1.
Draft 1, slides 13 & 16 (Christophe): "will different" --> "with different"; "with even" --> "even with"
Draft 1, slide 15 (Christophe): Maybe you could just explain what dynamic is, instead of repeating all the items and the figure from the static slide?
  Reply: I do think the figure is important to remind them how we are doing it. There is a bullet point describing what dynamic means.
Draft 1, slide 19 (Christophe): 1. "Using a code generated based on" --> "Using"; 2. "evolution" --> "evaluation"; 3. "polynomial evaluation" --> "polynomial approximation"
  Reply: 1. I think you mean "generator", since it was "generated".
Draft 1, slide 19 (Christophe): This slide implies you generated a single approximation for the lowest working precision, i.e., you can use it only for static strategies. Is that true?
  Reply: For now, that's true. We don't have dynamic switching with the generated code yet.
Draft 1, slide 25 (Christophe): "Publish" --> "Complete experiments and consolidate theoretical explanations and then publish"
Draft 1, slide 25 (Christophe): "a goal" --> "the goal"
Draft 1, slide 26 (Christophe): Since there is a single item in each category, the titles could be singular.
Draft 1, slides 4-5 (Silviu): "What are they" -> "What are they?"
Draft 1, slide 4 (Silviu): I don't like the definition here either; it would be better to start with an image that gives the broad idea of what a neural network is. After you give the image representation, you can talk about the general form of a layer (i.e., matrix multiplications and activation function evaluations).
Draft 1, all slides (Silviu): I'm missing a context for your talk. You start directly with neural networks and what they are. The question you should be asking at the beginning is: what are you trying to do, and why? Then, when you go into what you've done so far, that will consist of explaining how you do it.
  Reply: Yes, I thought about it. However, during my talk I will lay out the flow. I want to introduce NNs because all of my presentation has references to NNs.
Draft 1, slide 6 (Silviu): This slide is vague; you should differentiate between inference and training. And I would use an image to show how a DNN works.
Draft 1, slide 7 (Silviu): It would be great to have a more concrete idea of how the size of DNNs has been increasing in recent years; e.g., can you find a plot that shows how the size of DNNs has been growing for various domains/applications? It's not really clear what problem you are trying to solve.
Draft 1, slide 8 (Silviu): LeanAI is focused on training acceleration, not on inference.
Draft 1, slide 11 (Silviu): "We ran three experiments" -> I would formulate this differently; you ran many more than three experiments ;-) You can say something like "We worked on three main scenarios/topics".
  Reply: We worked on three main strategies.
Draft 1, slide 17 (Silviu): You need a way to show on the plots when you do the precision switch, similar to what you have on slide 18.
Draft 1, slide 9 (Wassim): Add images to make it clearer (neural network and
Draft 1, slide 9 (Wassim): Introduce the different activation functions with graphs.
Draft 1, slide 13 (Wassim): Introduce MpTorch.
Draft 1, all slides (Wassim): Make the English more formal.
Draft 1, slide 14 (Wassim): Add a remark about forward & backward using different quantizations.
Draft 1, slide 20 (Wassim): Illustrate the code generator.
Draft 1, slide 21 (Wassim): The code generator should not be in the box.
Draft 1, slide 26 (Wassim): Present the plan as timeline stages.
Draft 1, slide 18 (Wassim): In the test graph, the "e" in "Epoch" is lowercase.
Draft 1, all slides (Christophe): Increase the font for slide numbers.
  Reply: I will leave this until the end because I have to do them manually.
Draft 1, slides 14 & 17 & 18 (Christophe): You should say a word about timing there: the aim is "accelerating NNs"; is it achieved?
  Reply: These are the software-simulated experiments. We don't get any time benefits. Do you think I should state that?
Draft 1, slide 16 (Christophe): Anastasia's and Silviu's opinions needed here. You will trigger a question: why did you add a ResNet for this experiment? Since the explanation is purely chronological (if I remember correctly), you should perhaps use it also for static and display figures for it too.
Draft 1, slides 17 & 18 (Christophe): Since dynamic precision requires monitoring, what is the overhead with respect to static precision? You should say a word about that.
  Reply: Monitor the training and testing performance by checking whether the accuracy has stalled after a certain number of epochs (negligible overhead).
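To make the reply above concrete: a minimal sketch, assuming a patience-based stall check, of the kind of monitor it describes. The names (StallMonitor, patience, tol, switch_to_higher_precision) are hypothetical illustrations, not the project's actual code; the point is that the per-epoch cost is a few comparisons, consistent with the "negligible overhead" claim.

```python
class StallMonitor:
    """Detect when test accuracy has stalled for `patience` epochs.

    Hypothetical sketch of the monitoring described above; the real
    experiment code may differ.
    """

    def __init__(self, patience=5, tol=1e-3):
        self.patience = patience  # epochs to wait before declaring a stall
        self.tol = tol            # minimum improvement that resets the counter
        self.best_acc = float("-inf")
        self.epochs_since_improvement = 0

    def update(self, test_acc):
        """Record this epoch's accuracy; return True if training has stalled."""
        if test_acc > self.best_acc + self.tol:
            self.best_acc = test_acc
            self.epochs_since_improvement = 0
        else:
            self.epochs_since_improvement += 1
        return self.epochs_since_improvement >= self.patience


# Hypothetical usage in a training loop: switch to a higher precision
# once low-precision training stops improving.
# monitor = StallMonitor(patience=5)
# for epoch in range(max_epochs):
#     test_acc = evaluate(model)
#     if monitor.update(test_acc):
#         switch_to_higher_precision(model)  # placeholder for the actual switch
#         monitor = StallMonitor(patience=5)
```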
Draft 1, slide 21 (Christophe): Why only ResNet?
  Reply: After dynamic, we only ran the experiments on ResNet.
Draft 1, slide 22 (Christophe): You will have to explain why the low precision needs fewer epochs, and whether the time gain is only due to this.
  Reply: Good point. This is due to the EarlyStopping mechanism. I will add that to the graph as an annotation.
Draft 1, slide 28 (Christophe): I would recommend including these references as footnotes on the slides where they are cited. Conversely, you could list here the most important items from the bibliography you have covered in the past months...
Draft 1, slide 10 (Silviu): This slide is definitely incomplete. When putting references on a slide, I would add them as a footnote to the page. You need to be more complete in your presentation of the state of the art, and in general you should keep watching for new works that use low/mixed-precision computing for training acceleration.
  Reply: OK.
Draft 1, slide 12 (Silviu): You should be clearer about the fact that you are quantizing to a lower precision; you should have a slide at the beginning about quantization, since lowering the numerical precision is the main way you want to accelerate training in your work.
  Reply: Not sure if still applicable.
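As an aside on the quantization slide Silviu asks for: a minimal sketch of simulated precision lowering, i.e., rounding float32 values to the float16 grid while keeping float32 storage. This is a generic illustration of the idea, not the setup used in the experiments.

```python
import torch

def simulate_half_precision(x: torch.Tensor) -> torch.Tensor:
    """Round a float32 tensor to the nearest float16 value, then cast
    back, so downstream code still sees float32 storage. Many
    mixed-precision training studies simulate low-precision arithmetic
    this way."""
    return x.half().float()

x = torch.randn(4)  # float32 by default
print(x - simulate_half_precision(x))  # the (small) rounding error
```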
Draft 1, slides 13 & 16 (Silviu): You should do more experiments with more networks and datasets. Why are those theoretical results and not experimental results? What is the theory behind them? You talk about quantizing activation functions, but you never say which activation functions you are quantizing in your experiments (e.g., GeLU, tanh, sigmoid?). You should also run the static and dynamic quantization experiments on the same networks and datasets, since this would allow you to compare the two approaches. You also need a slide where you compare the two approaches (static vs. dynamic) and discuss the possible advantages and disadvantages of both.
Draft 1 (Wassim): Add a comparison slide between static and dynamic quantization.
Draft 1, slide 19 (Wassim): Add a second experiment for ResNet.
Draft 2, slide 15 (Wassim): Add a literature review.
Draft 2, slide 25 (Wassim): Add an annotation on the test accuracy graph.
Draft 2, slide 29 (Wassim): Add better graphs.
Draft 2, slide 5 (Wassim): Add applications of DL in the space below.
Draft 2, slide 6 (Wassim): Increase the size of the images on the slide.
Draft 2, slide 7 (Wassim): Remove the bracket and the equation below.
Draft 2, slide 7 (Wassim): Add the input-to-layer notation.
Draft 2, slide 8 (Wassim): Put the formula under the big box instead of under the layer name.
Draft 2, slide 8 (Wassim): In a^(l-1), remove the parentheses.
Draft 2, slide 8 (Wassim): Specify that there are multiple types of layers.
Draft 2, slide 8 (Wassim): Say that W & b are the weights and biases.
Draft 2, slide 8 (Wassim): Decouple the layers and the activation functions.
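For reference, the slide-8 comments above all seem to concern the usual feedforward layer formula; a plausible rendering, assuming standard notation (the slide's actual symbols may differ):

```latex
% One layer of a feedforward network: W^{(l)} and b^{(l)} are the
% layer's weights and biases, f^{(l)} its activation function, and
% a^{(l-1)} the previous layer's output (the network input for l = 1).
a^{(l)} = f^{(l)}\!\left( W^{(l)} a^{(l-1)} + b^{(l)} \right)
```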
Draft 2, slide 9 (Wassim): In the derivative, show the weight update.
Draft 2, slide 9 (Wassim): Rotate the images and transition from forward to backward instead of using two images.
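Similarly, for the slide-9 request to show the weight update next to the derivative, the standard gradient-descent step would be (again assuming conventional notation, with learning rate \eta and loss \mathcal{L}):

```latex
% Gradient-descent update applied after the backward pass:
W^{(l)} \leftarrow W^{(l)} - \eta \, \frac{\partial \mathcal{L}}{\partial W^{(l)}}
```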
Draft 2, slide 10 (Wassim): Redundant; merge it into slide 9.
Draft 2, slides 11-12 (Wassim): Merge them.
  Reply: To be discussed.
Draft 2, slide 12 (Wassim): Fix the wording.
Draft 2, slide 12 (Wassim): Add the LeanAI partners.
Draft 2, slide 15 (Wassim): How do you position yourself in the state of the art?
Draft 2, slide 16 (Wassim): After the literature review, add the idea of what you are doing.
Draft 2, all slides (Wassim): Move from "quantization" to "uniform precision switch".
Draft 2, all slides (Wassim): Remove "software simulation".
Draft 2, slide 19 (Wassim): Move the image from slide 20 to slide 19.
Draft 2, slide 21 (Wassim): Can be removed and simply mentioned on slide 20.
Draft 2, slide 23 (Wassim): Change the text.
Draft 2, slide 25 (Wassim): Remove the arrows.
Draft 2, slide 28 (Wassim): "Low Accuracy" instead of "low precision".
Draft 2, all slides (Wassim): "Dynamic Quantization" -> "Dynamic Precision Switching".
Draft 2, slide 27 (Wassim): Add some text about the generated code.
Draft 2, slide 31 (Wassim): Remove the summary.
  Reply: I still feel it's good to have it.
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100