Learning is a continuous process in which the brain adaptively changes its activity to improve performance. Outcome representation by cortical networks during learning has been the focus of many research groups over the past decade, with most work addressing binary outcome: success or failure in performing a task. Still, it is plausible that neuronal networks use a richer representation and report a continuous value of outcome in order to reach optimal performance faster. To test this hypothesis, we analyzed the neuronal activity of pyramidal cells in layer 2-3 of the primary motor cortex (M1), recorded from mice performing a hand-reach task. In this setup, mice are trained to reach for, grab, and consume a plain food pellet. Once trained, the mice are introduced to flavored pellets, sweet and bitter, alongside the plain ones used for training. Here we introduce a multivariate analysis pipeline for detecting sub-populations that encode binary outcome (success/failure) and value of outcome (flavor). Our results confirm the existence of a sub-population of cells reporting binary outcome well after movement is over. Once flavors are introduced, the encoding of binary outcome changes but does not vanish. Representation of flavor emerges with exposure to flavor: at first novelty is reported, and later the value of the outcome (aversive vs. tasty). Interestingly, flavor is also reported during the preparatory time segment, before the go-cue. Overall, our results confirm the hypothesis of a rich representation of outcome, not only as a binary success/failure value after movement is complete, but as a continuous variable reported throughout the trial, depending on the experimental context.