Explanations
proper nouns or named entities
the word "was" in various contexts
Negative Logits
Token      Logit
IMAGES     -0.76
onic       -0.69
Submit     -0.67
Current    -0.67
has        -0.66
arta       -0.66
LAT        -0.63
Make       -0.62
stand      -0.62
Extend     -0.61
Positive Logits
Token        Logit
unclear      1.19
raining      1.06
impossible   0.99
inevitable   0.95
easier       0.89
imperative   0.88
evident      0.88
worth        0.87
rumored      0.87
fortunate    0.87
Activations Density: 0.089%