Explanations
references to historical events related to the atomic bombings of Hiroshima and Nagasaki
Negative Logits
Token       Logit
amores      -0.37
unehmen     -0.36
cuidado     -0.31
jeito       -0.31
novios      -0.31
povr        -0.30
ajan        -0.30
shapes      -0.29
ne          -0.29
livery      -0.29
Positive Logits
Token           Logit
faſt            0.92
ſever           0.85
deſt            0.83
ſte             0.78
enterOuterAlt   0.77
ſelf            0.77
AsUp            0.76
diſt            0.75
ſta             0.75
leaſt           0.74
Activations Density: 0.018%