INDEX
Explanations
names of specific locations, possibly related to politics or events happening in those locations
references to the place or entity "Samar"
Negative Logits
Token     Logit
RAFT      -1.04
ered      -0.82
spring    -0.80
ORN       -0.77
sheet     -0.73
puff      -0.72
draft     -0.71
cember    -0.70
urden     -0.70
ORED      -0.70
Positive Logits
Token     Logit
itans     1.09
itan      1.03
Samar     0.93
pling     0.84
eness     0.78
agi       0.76
pha       0.75
elines    0.72
autions   0.72
anthrop   0.71
Activations Density: 0.009%