INDEX
Explanations
information about various news stories or political events, possibly with a focus on controversy or conflict
instances of a specific character or symbol in text
Negative Logits
avoidance   -0.67
reflex      -0.64
matic       -0.64
lapse       -0.64
apes        -0.62
pigeon      -0.61
avenues     -0.59
ulative     -0.59
pid         -0.59
Antar       -0.59
Positive Logits
U+FE0F (variation selector-16)   1.01
ever                             0.87
arthed                           0.84
────                             0.83
──                               0.81
conom                            0.80
ield                             0.79
€                                0.78
uthor                            0.77
reci                             0.74
Activations Density 0.187%