INDEX
Explanations
words related to change or transformation
terms related to transformation or change
Negative Logits
bis       -0.73
PLIED     -0.70
avering   -0.60
Found     -0.60
WARN      -0.59
bast      -0.59
ept       -0.58
oiler     -0.58
llah      -0.58
blers     -0.56
Positive Logits
into      1.21
into      1.10
ively     1.05
INTO      0.98
Into      0.87
ational   0.81
ative     0.79
ives      0.78
atted     0.75
ELF       0.72
Activations Density: 0.072%