INDEX
Explanations
references to political divisions and discussions
references to political contexts or situations
Negative Logits
Token       Logit
estamp      -0.76
Scores      -0.67
Investig    -0.66
ettings     -0.66
Fighters    -0.65
Nanto       -0.64
itar        -0.63
OD          -0.62
Og          -0.62
odor        -0.62
Positive Logits
Token       Logit
aisle       1.31
stalls      0.95
seat        0.90
seats       0.86
nuts        0.82
seating     0.80
stall       0.80
shelves     0.77
seat        0.76
wal         0.75
Activations Density: 0.004%