Explanations
proper nouns related to people or places, particularly those including the word "Samar"
references to geographical locations and associated entities
Negative Logits
Token      Logit
RAFT       -0.77
netflix    -0.75
nut        -0.74
puff       -0.74
rar        -0.72
rams       -0.70
spring     -0.68
bending    -0.68
oola       -0.67
ORN        -0.66
Positive Logits
Token      Logit
itans      1.00
Samar      0.90
itan       0.88
pling      0.85
eness      0.72
Kou        0.71
Strip      0.70
ãĥ£        0.69
©          0.67
µ          0.66
Activations Density 0.047%
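
As a rough illustration of how a density figure like the one above might be read, the sketch below assumes that density is the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition for illustration only; the dashboard does not
    state how its density figure is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 5 of which activate the feature.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)

# Format as a percentage with three decimals, like the dashboard figure.
print(f"{density:.3%}")
```

Under this reading, a density of 0.047% would mean the feature fires on roughly 47 out of every 100,000 sampled tokens, which fits a feature tied to a narrow set of proper nouns.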