# | Draft | Slide | Commenter | Comment | Reply
---|---|---|---|---|---
2 | 1 | 3 | Christophe | "Masters" → "Master" (two times) |
3 | 1 | 4 | Christophe | The definition of NNs seems counter-intuitive: are they made up of layers of artificial neurons, or of matrices and vectors? Starting with the figure would be better, I think. | I added some slides for the explanation
4 | 1 | all | Christophe | Many sentences are missing a full stop at the end |
5 | 1 | 6 | Christophe | "your output" → "the output" |
6 | 1 | 7 | Christophe | You have only stated facts, not a problem; what is the problem? | These computations are expensive in terms of time and memory, making them hard to use in real-world scenarios
7 | 1 | 8 | Christophe | "aimed at working on NN acceleration" → "aims at accelerating NNs" |
8 | 1 | 8 | Christophe | The learning and inference stages should have been defined earlier, when introducing NNs | I added a point on slide 5
9 | 1 | 10 | Christophe | Seems incomplete... | True. Covered by point 1
10 | 1 | 13 & 16 | Christophe | "will different" → "with different"; "with even" → "even with" |
11 | 1 | 15 | Christophe | Maybe you could just explain what "dynamic" means instead of repeating all the items and the figure from the static slide? | I do think the figure is important to remind the audience of how we do it; there is a bullet point describing what dynamic means
12 | 1 | 19 | Christophe | "Using a code generated based on" → "Using"; "evolution" → "evaluation"; "polynomial evaluation" → "polynomial approximation" | 1. I think you mean "generator", since it was "generated"
13 | 1 | 19 | Christophe | This slide implicitly suggests you generated a single approximation for the lowest working precision, i.e. it can only be used for static strategies. Is that true? | For now that's true; we don't have dynamic switching with the generated code yet
14 | 1 | 25 | Christophe | "Publish" → "Complete experiments and consolidate theoretical explanations, and then publish" |
15 | 1 | 25 | Christophe | "a goal" → "the goal" |
16 | 1 | 26 | Christophe | Since there is a single item in each category, the titles could be singular |
17 | 1 | 4-5 | Silviu | "What are they" → "What are they?" |
18 | 1 | 4 | Silviu | I don't like the definition here either; it would be better to start with an image that gives the broad idea of what a neural network is. After the image representation, you can talk about the general form of a layer (i.e. matrix multiplications and activation-function evaluations) |
19 | 1 | all | Silviu | I'm missing context for your talk. You start directly with neural networks and what they are. The question you should ask at the beginning is: what are you trying to do, and why? Then, when you go into what you've done so far, that will consist of explaining how you do it | Yes, I thought about it. However, during the talk I will lay out the flow; I want to introduce NNs because the whole presentation refers to them
20 | 1 | 6 | Silviu | This slide is vague; you should differentiate between inference and training. I would also use an image to show how a DNN works |
21 | 1 | 7 | Silviu | It would be great to give a more concrete idea of how the size of DNNs has been increasing in recent years; e.g. can you find a plot showing this for various domains/applications? It's not really clear what problem you are trying to solve |
22 | 1 | 8 | Silviu | LeanAI is focused on training acceleration, not on inference |
23 | 1 | 11 | Silviu | "We ran three experiments" → I would phrase this differently; you ran many more than three experiments ;-) You could say something like "We worked on three main scenarios/topics" | We worked on three main strategies
24 | 1 | 17 | Silviu | You need a way to show on the plots when the precision switch happens, similar to what you have on slide 18 |
25 | 1 | 9 | Wassim | Add images to make it clearer (neural network and |
26 | 1 | 9 | Wassim | Introduce the different activation functions with graphs |
27 | 1 | 13 | Wassim | Introduce MpTorch |
28 | 1 | all | Wassim | Make the English more formal |
29 | 1 | 14 | Wassim | Add a remark that the forward and backward passes use different quantization |
30 | 1 | 20 | Wassim | Illustrate the code generator |
31 | 1 | 21 | Wassim | The code generator should not be in the box |
32 | 1 | 26 | Wassim | Present the plan as timed stages |
33 | 1 | 18 | Wassim | In the test graph, the "e" in "Epoch" is lowercase |
34 | 1 | all | Christophe | Increase the font size of the slide numbers | I will leave this until the end because I have to do them manually
35 | 1 | 14 & 17 & 18 | Christophe | You should say a word about timing here: the aim is "accelerating NNs"; is it achieved? | These are the software-simulated experiments; we don't get any time benefits. Do you think I should state that?
36 | 1 | 16 | Christophe | Anastasia's and Silviu's opinions needed here. This will trigger a question: why did you add a ResNet for this experiment? Since the explanation is purely chronological (if I remember correctly), perhaps you should also use it for static and display figures for it too? |
37 | 1 | 17 & 18 | Christophe | Since dynamic precision requires monitoring, what is the overhead with respect to static precision? You should say a word about that | We monitor the training and testing performance by checking whether the accuracy has stalled after a certain number of epochs (negligible overhead)
38 | 1 | 21 | Christophe | Why only ResNet? | After dynamic, we only ran the experiments on the ResNet
39 | 1 | 22 | Christophe | You will have to explain why the low precision needs fewer epochs, and whether the time gain is only due to this | Good point. This is due to the early-stopping mechanism; I will add it to the graph as an annotation
40 | 1 | 28 | Christophe | I would recommend including these references as footnotes on the slides where they are cited. Conversely, you could list here the most important items from the bibliography you have covered in the past months... |
41 | 1 | 10 | Silviu | This slide is definitely incomplete. When putting references on a slide, I would add them as footnotes. You need to be more complete in your presentation of the state of the art, and in general you should keep track of new works that use low/mixed-precision computing for training acceleration | OK
42 | 1 | 12 | Silviu | You should be clearer about the fact that you are quantizing to a lower precision; you should have a slide at the beginning about quantization, since lowering the numerical precision is the main way you want to accelerate training in your work | Not sure if still applicable
43 | 1 | 13 & 16 | Silviu | You should run more experiments, with more networks and datasets. Why are those theoretical results and not experimental results? What is the theory behind them? You talk about quantizing activation functions, but you never say which activation functions you are quantizing in your experiments (e.g. GELU, tanh, sigmoid?). You should also run the static and dynamic quantization experiments on the same networks and datasets, so the two approaches can be compared. You also need a slide comparing the two approaches (static vs. dynamic) and discussing the possible advantages and disadvantages of each |
44 | 1 | | Wassim | Add a comparison slide between static and dynamic quantization |
45 | 1 | 19 | Wassim | Add a second experiment for ResNet |
46 | 2 | 15 | Wassim | Add a literature review |
47 | 2 | 25 | Wassim | Add an annotation on the test-accuracy graph |
48 | 2 | 29 | Wassim | Add better graphs |
49 | 2 | 5 | Wassim | Add applications of DL in the space below |
50 | 2 | 6 | Wassim | Enlarge the images on the slide |
51 | 2 | 7 | Wassim | Remove the bracket and the equation below |
52 | 2 | 7 | Wassim | Add the input to the layer notation |
53 | 2 | 8 | Wassim | Put the formula under the big box instead of under the layer name |
54 | 2 | 8 | Wassim | Remove the parentheses in a^(l-1) |
55 | 2 | 8 | Wassim | Specify that there are multiple types of layers |
56 | 2 | 8 | Wassim | Say that W and b are the weights and biases |
57 | 2 | 8 | Wassim | Decouple the layers and the activation functions |
58 | 2 | 9 | Wassim | In the derivative, put the weight update |
59 | 2 | 9 | Wassim | Rotate the images and transition between forward and backward instead of using two images |
60 | 2 | 10 | Wassim | Redundant; add it to slide 9 |
61 | 2 | 11-12 | Wassim | Merge them | To be discussed
62 | 2 | 12 | Wassim | Fix the wording |
63 | 2 | 12 | Wassim | Add the LeanAI partners |
64 | 2 | 15 | Wassim | How do you position yourself in the state of the art? |
65 | 2 | 16 | Wassim | After the literature review, add the idea of what you are doing |
66 | 2 | all | Wassim | Move from "quantization" to "uniform precision switch" |
67 | 2 | all | Wassim | Remove "software simulation" |
68 | 2 | 19 | Wassim | Move the image from slide 20 to slide 19 |
69 | 2 | 21 | Wassim | Can be removed and just mentioned on slide 20 |
70 | 2 | 23 | Wassim | Change the text |
71 | 2 | 25 | Wassim | Remove the arrows |
72 | 2 | 28 | Wassim | "Low accuracy" instead of "low precision" |
73 | 2 | all | Wassim | "Dynamic Quantization" → "Dynamic Precision Switching" |
74 | 2 | 27 | Wassim | Add some text about the generated code |
75 | 2 | 31 | Wassim | Remove the summary | I still think it's good to have it
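The reply to comment 37 describes the monitoring behind the dynamic precision switch: track test accuracy and switch precision once it has stalled for a number of epochs, at negligible overhead. A minimal sketch of that idea is below; the function name `has_stalled`, the `patience`/`min_delta` parameters, and the precision levels are illustrative assumptions, not the project's actual code.

```python
# Illustrative sketch of stall-based dynamic precision switching
# (assumed names and thresholds; not the project's actual code).

def has_stalled(acc_history, patience=5, min_delta=1e-3):
    """True if the best accuracy of the last `patience` epochs did not
    improve on the best accuracy seen before them by at least `min_delta`."""
    if len(acc_history) <= patience:
        return False
    best_before = max(acc_history[:-patience])
    best_recent = max(acc_history[-patience:])
    return best_recent - best_before < min_delta

# Toy usage: precision levels ordered low -> high; the per-epoch cost is
# two max() scans over the history, hence the overhead is negligible.
precisions = [8, 16, 32]  # e.g. working-precision bit widths (illustrative)
level = 0
accs = []
for acc in [0.50, 0.60, 0.65, 0.6505, 0.6502, 0.6504, 0.6503, 0.6501]:
    accs.append(acc)
    if has_stalled(accs) and level < len(precisions) - 1:
        level += 1       # accuracy plateaued: move to the next precision
        accs.clear()     # restart monitoring at the new precision
print(precisions[level])  # prints 16
```

This keeps the comparison purely on recorded accuracies, which is why the reply can claim a negligible overhead compared to a static-precision run.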