INDEX
Explanations
adjectives describing the intensity or significance of potential consequences or impacts
phrases indicating the impact or consequences of actions or events
Negative Logits
erville     -0.79
Franch      -0.77
mant        -0.73
etsk        -0.72
uese        -0.71
dit         -0.69
byn         -0.68
cad         -0.65
bast        -0.63
acan        -0.63
Positive Logits
outwe       1.27
negligible  1.21
outweigh    1.16
immense     1.13
undeniable  1.09
manifold    1.07
limitless   1.02
minimized   0.99
minimal     0.98
diminished  0.98
Activations Density 0.290%