INDEX
Explanations
terms related to forgetting and sponsorship
Negative Logits
#+#            -0.79
хьтан          -0.75
enfans         -0.71
Jeografia      -0.69
мәкал          -0.65
舺             -0.63
oredCriteria   -0.63
Italij         -0.63
uſed           -0.63
rboles         -0.62
Positive Logits
forgotten      0.89
forget         0.80
forgot         0.72
forget         0.68
forgetting     0.65
forgotten      0.64
Forget         0.60
Forgotten      0.60
Forget         0.57
knowledge      0.55
Activations Density: 0.255%