INDEX
Explanations
statements related to legal or financial issues
followed by actions or opinions
official statements and past actions
Negative Logits
ftagPool        -0.58
almost          -0.57
Almost          -0.56
almost          -0.53
意外と          -0.53
cheap           -0.52
måske           -0.52
sneaky          -0.52
screamed        -0.52
Almost          -0.51
Positive Logits
verständlich    0.82
neceff          0.82
eraard          0.81
fubject         0.80
Majefty         0.79
neceffary       0.77
ſeveral         0.76
naturlig        0.74
regrettable     0.74
itſelf          0.74
Activations Density: 0.271%