INDEX
Explanations
sentences related to legal actions or criminal activities
punctuation marks or sentence terminators, indicating the conclusion of statements
Negative Logits
instinct      -0.75
pudding       -0.74
calming       -0.73
powering      -0.70
impuls        -0.70
facult        -0.69
empowerment   -0.69
perspect      -0.69
meddling      -0.68
lips          -0.67
Positive Logits
Each          1.68
Among         1.68
These         1.66
Those         1.59
Others        1.51
None          1.50
Included      1.46
Among         1.43
Some          1.43
Including     1.40
Activations Density 0.543%