INDEX
Explanations
instances of the word "Pog" or variations of it
references to specific individuals and the Pittsburgh Penguins
Negative Logits
Token     Logit
ifted     -0.79
owship    -0.76
mere      -0.76
ameda     -0.73
usted     -0.73
irable    -0.72
izons     -0.70
semble    -0.69
allow     -0.69
ARP       -0.66
Positive Logits
Token     Logit
manship   0.92
insula    0.89
OTUS      0.82
hew       0.75
ongyang   0.73
Pog       0.72
tp        0.71
Nieto     0.69
sburg     0.67
lyak      0.67
Activations Density: 0.038%