INDEX
Explanations
mentions of dolphins
references to dolphins
Negative Logits
Token         Logit
eners         -0.85
ablishment    -0.82
rooms         -0.81
イト          -0.71
rade          -0.70
ヴァ          -0.70
ride          -0.67
istries       -0.65
ielding       -0.65
cation        -0.63
Positive Logits
Token         Logit
dolphin       0.92
iform         0.83
arium         0.81
dolphins      0.77
patch         0.74
Seal          0.70
bone          0.70
odon          0.69
agraph        0.68
Swim          0.68
Activations Density: 0.029%
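
To put the density figure in concrete terms: 0.029% corresponds to roughly 29 tokens out of every 100,000. The sketch below assumes that density means the fraction of tokens on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce the figure.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" is the share of tokens on which the feature
    fires at all; the dashboard does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, 29 of which activate the feature.
acts = [0.0] * 99_971 + [1.5] * 29
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.029%
```

A feature this sparse fires on very few tokens, which fits the narrow explanations listed for it.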