INDEX
Explanations
references to Disney and dimensions in a context that indicates their significance in various scenarios
Negative Logits
Optimus          -0.56
<<<<<<<<<<<<<<   -0.52
SIT              -0.52
Marissa          -0.52
Psych            -0.52
HON              -0.51
dat              -0.50
propto           -0.50
tr               -0.49
ara              -0.49
Positive Logits
faſt              0.61
cauſe             0.60
myſelf            0.59
avoient           0.58
ainfi             0.58
desmotivaciones   0.56
purpoſe           0.55
ſtate             0.54
pleaſure          0.53
themſelves        0.52
Activations Density 0.203%