Explanations
instances where a particular "thing" is emphasized within a context
phrases emphasizing important or notable statements
Negative Logits
Token      Logit
inav       -0.76
ULT        -0.72
cul        -0.71
oğ         -0.70
lems       -0.70
DOS        -0.67
bush       -0.67
ONSORED    -0.67
JV         -0.65
vez        -0.65
Positive Logits
Token      Logit
happens    0.78
iverse     0.76
rued       0.74
happened   0.70
separates  0.68
Valiant    0.65
afort      0.64
counts     0.63
happening  0.62
touches    0.60
Activations Density: 0.032%