INDEX
Explanations
evaluative language related to experiences
Negative Logits
currently     -0.99
atualmente    -0.90
attualmente   -0.87
actualmente   -0.85
currently     -0.85
presently     -0.85
obecnie       -0.85
actuellement  -0.85
derzeit       -0.81
Currently     -0.80
Positive Logits
was         0.83
楽しかった  0.82
Overall     0.75
Overall     0.73
面白かった  0.71
Highlights  0.71
overall     0.70
пришлось    0.68
ended       0.66
despite     0.66
Activations Density 0.800%