The most recent Pfam release (30.0) contains 16,306 families (http://pfam.xfam.org/). Of these, 5,423 (33.3%) are assigned to 595 clans. Of the clans, 254 (42.7%) contain 5 or more families, and 20 contain more than 40.
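The percentages above follow directly from the quoted counts. As a quick check, a minimal Python sketch (the variable and function names are ours, chosen only for illustration) recomputes them:

```python
# Counts quoted in the text for Pfam release 30.0.
total_families = 16306
families_in_clans = 5423
total_clans = 595
clans_with_5_plus = 254


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


family_share = percent(families_in_clans, total_families)
clan_share = percent(clans_with_5_plus, total_clans)
print(family_share, clan_share)
```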


According to Finn et al. (2016), the graphical representation of clans with more than 40 members has been a challenge. The most recent Pfam release attempts to address the issue with a new JavaScript clanviewer (Finn et al., 2016) that presents the families of a clan. In our experience, such a network-like, flash-animated graph still does not work well for these large clans. With pHMM-tree, we have built phylogenies for the 254 clans with at least 5 families. All the trees can be viewed on our website. In the following, we use two example clans to illustrate the phylogenies.


Pfam Clan  Clan Name        Clan Size/Members
CL0001     EGF              15
CL0003     SAM              5
CL0004     Concanavalin     25
CL0007     KH               7
CL0010     SH3              10
CL0011     Ig               28
CL0012     Histone          15
CL0013     Beta-lactamase   6
CL0014     Glutaminase_I    14
CL0015     MFS              24
CL0016     PKinase          23
CL0020     TPR              132
CL0021     OB               61
CL0022     LRR              11
CL0023     P-loop_NTPase    199
CL0025     His_Kinase_A     9
CL0026     CU_oxidase       11
CL0027     RdRP             9
CL0028     AB_hydrolase     66
CL0029     Cupin            57
CL0030     Ion_channel      7
CL0031     Phosphatase      10
CL0032     Dim_A_B_barrel   21
CL0033     POZ              6
CL0034     Amidohydrolase   14
CL0035     Peptidase_MH     16
CL0036     TIM_barrel       57
CL0037     Lysozyme         15
CL0039     HUP              26
CL0040     tRNA_synt_II     10
CL0041     Death            6
CL0042     Flavoprotein     7
CL0044     Ferritin         21
CL0046     Thiolase         13
CL0048     LolA_LolB        14
CL0049     Tudor            17
CL0050     HotDog           13
CL0051     NTF2             28
CL0052     NTN              14
CL0053     4H_Cytokine      25
CL0054     Knottin_1        13
CL0055     Viral_ssRNA_CP   26
CL0056     C_Lectin         11
CL0057     Met_repress      31
CL0058     Glyco_hydro_tim  53
CL0059     6_Hairpin        27
CL0060     DNA_clamp        10
CL0061     PLP_aminotran    15
CL0062     APC              20
CL0063     NADP_Rossmann    184
CL0064     CPA_AT           13
CL0065     Cyclin           10
CL0066     Trefoil          20
CL0067     SIS              5
CL0070     ACT              11
CL0072     Ubiquitin        42
CL0073     P53-like         8
CL0074     Matrix           6
CL0075     Defensin         9
CL0076     FAD_Lum_binding  5
CL0079     Cystine-knot     11
CL0080     Beta-tent        7
CL0081     MBD-like         7
CL0082     MIF              6
CL0083     Omega_toxin      25
CL0084     ADP-ribosyl      12
CL0085     FAD_DHS          9
CL0088     Alk_phosphatase  10
CL0089     GlnB-like        9
CL0093     Peptidase_CD     8
CL0098     SPOUT            12
CL0101     PELOTA           5
CL0103     Gal_mutarotase   17
CL0104     Glyoxalase       8
CL0105     Hybrid           21
CL0106     6PGD_C           13
CL0107     KOW              6
CL0108     Actin_ATPase     31
CL0109     CDA              18
CL0110     GT-A             46
CL0111     GT-C             19
CL0112     Yip1             6
CL0113     GT-B             38
CL0114     HMG-box          7
CL0115     Steroid_dh       6
CL0116     Calycin          36
CL0117     uPAR_Ly6_toxin   7
CL0118     Ribokinase       5
CL0121     Cystatin         8
CL0123     HTH              254
CL0124     Peptidase_PA     25
CL0125     Peptidase_CA     65
CL0126     Peptidase_MA     58
CL0127     ClpP_crotonase   10
CL0128     vWA-like         17
CL0129     Peptidase_AA     14
CL0131     DoxD-like        9
CL0135     Arrestin_N-like  6
CL0136     Plasmid-antitox  9
CL0137     HAD              22
CL0141     MtN3-like        5
CL0142     Membrane_trans   7
CL0143     B_Fructosidase   6
CL0144     Periplas_BP      9
CL0145     Golgi-transport  9
CL0149     CoA-acyltrans    8
CL0151     PK_TIM           11
CL0153     dUTPase          7
CL0154     C2               11
CL0158     GH_CE            7
CL0159     E-set            82
CL0161     GAF              12
CL0163     Calcineurin      8
CL0165     Cache            23
CL0167     Zn_Beta_Ribbon   74
CL0169     Rep              10
CL0170     Peptidase_MD     7
CL0172     Thioredoxin      57
CL0173     STIR             5
CL0174     TetR_C           13
CL0175     TRASH            12
CL0177     PBP              25
CL0178     PUA              10
CL0179     ATP-grasp        21
CL0181     ABC-2            12
CL0182     IT               19
CL0183     PAS_Fold         13
CL0184     DMT              21
CL0186     Beta_propeller   71
CL0192     GPCR_A           38
CL0193     MBB              65
CL0196     DSRM             5
CL0197     GME              5
CL0198     HHH              20
CL0199     DPBB             5
CL0202     GBD              32
CL0203     CBD              5
CL0204     Adhesin          15
CL0208     UBC              9
CL0209     Bet_V_1_like     14
CL0212     SNARE            7
CL0214     UBA              11
CL0219     RNase_H          48
CL0220     EF_hand          22
CL0221     RRM              18
CL0222     MviN_MATE        7
CL0226     M6PR             5
CL0229     RING             38
CL0230     HO               5
CL0231     MazG             5
CL0236     PDDEXK           123
CL0237     HD_PDEase        10
CL0246     ISOCOT_Fold      12
CL0247     2H               9
CL0254     THDP-binding     9
CL0255     ATP_synthase     13
CL0257     Acetyltrans      35
CL0260     NTP_transf       23
CL0261     NUDIX            5
CL0263     His-Me_finger    22
CL0264     SGNH_hydrolase   12
CL0265     HIT              8
CL0266     PH               40
CL0268     Pec_lyase-like   22
CL0270     Iso_DH           5
CL0271     F-box            5
CL0274     WRKY-GCM1        6
CL0277     FAD-oxidase_C    6
CL0280     PIN              18
CL0286     GCS              8
CL0287     Transthyretin    16
CL0291     KNTase_C         8
CL0292     LysE             17
CL0293     CDC              6
CL0295     Vps51            17
CL0304     CheY             5
CL0306     HeH              6
CL0307     FUSC             9
CL0310     DinB             8
CL0315     Gx_transp        11
CL0316     Acyl_transf_3    7
CL0317     Multiheme_cytos  15
CL0318     Cytochrome-c     13
CL0319     SHS2             7
CL0320     PepSY            7
CL0322     RND_permease     5
CL0324     Homing_endonuc   7
CL0327     Pilus            11
CL0328     2heme_cytochrom  9
CL0329     S5               17
CL0330     AVL9             5
CL0331     EpsM             7
CL0333     gCrystallin      6
CL0334     THBO-biosyn      6
CL0335     FumRed-TM        5
CL0336     FMN-binding      9
CL0339     PFL-like         6
CL0342     TolB_N           6
CL0343     MHC              5
CL0344     4Fe-4S           26
CL0351     CHCH             7
CL0352     EsxAB            5
CL0361     C2H2-zf          29
CL0362     RAMPS-Cas5-like  7
CL0363     H-int            7
CL0366     JAB              6
CL0368     PhosC-NucP1      5
CL0369     GHD              20
CL0372     Hy-ly_N          5
CL0373     Phage-coat       6
CL0375     Transporter      12
CL0381     Metallo-HOrase   8
CL0382     DNA-mend         8
CL0390     zf-FYVE-PHD      10
CL0400     GG-leader        5
CL0401     AsmA-like        6
CL0413     Toprim-like      7
CL0434     Sialidase        11
CL0437     EF-G_C           7
CL0450     FimbA            6
CL0451     FnI-like         5
CL0459     BRCT-like        5
CL0461     Metallothionein  6
CL0465     Ank              7
CL0466     PDZ-like         5
CL0469     l-integrase_N    6
CL0475     Cyclophil-like   6
CL0479     PLD              8
CL0482     Prolamin         5
CL0483     PreATP-grasp     7
CL0486     Fer2             5
CL0487     FKBP             6
CL0497     GST_C            8
CL0498     Nribosyltransf   6
CL0499     O-anti_assembly  5
CL0500     Glycine-zipper   7
CL0504     Phage_barrel     8
CL0511     Retroviral_zf    6
CL0523     GAG-polyprotein  7
CL0526     SUKH             7
CL0527     Sm-like          5
CL0533     PRTase-like      6
CL0551     BCLiA            5
CL0556     PapD-like        6
CL0559     CBD9-like        5
CL0567     Phage_TACs       14
CL0568     Man_lectin       5
CL0569     Phage_TTPs       9
CL0571     30K_movement     7
CL0575     EFTPs            7
CL0594     DUF1735          5
CL0603     AA_dh_N          5
CL0604     post-AAA         5
CL0613     Terp_synthase    5
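To work with the clan list programmatically, one could filter it by clan size. The following is a minimal sketch, assuming the rows are available as whitespace-separated columns (clan accession, clan name, clan size); only a small excerpt of the table is embedded, and the helper names `parse_rows` and `larger_than` are hypothetical.

```python
# Excerpt of the clan table, assumed to be whitespace-separated:
# clan accession, clan name, number of member families.
rows = """\
CL0014 Glutaminase_I 14
CL0020 TPR 132
CL0023 P-loop_NTPase 199
CL0063 NADP_Rossmann 184
CL0123 HTH 254
CL0236 PDDEXK 123
CL0527 Sm-like 5
"""


def parse_rows(text):
    """Parse table rows into (accession, name, size) tuples."""
    clans = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        clan_id, name, size = line.split()
        clans.append((clan_id, name, int(size)))
    return clans


def larger_than(clans, threshold):
    """Return the clans with more than `threshold` member families."""
    return [clan for clan in clans if clan[2] > threshold]


clans = parse_rows(rows)
large = larger_than(clans, 40)
largest = max(clans, key=lambda clan: clan[2])
print([name for _, name, _ in large], largest)
```

Applied to the full table rather than this excerpt, the same filter would pick out the clans that are hardest to display as a network graph.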

The first example is the Pfam clan Glutaminase_I (CL0014), which has 14 families. The accompanying figure shows two representations of the family relationships within the clan. On the left is the clanviewer network graph, in which circles represent families and edges represent their relationships. The size of each circle reflects the size of the family, and the width of each edge reflects the similarity between the two families. The graph can be dragged to rearrange the layout and can be zoomed in or out. On the right is the pHMM phylogeny, a more classic way of presenting evolutionary relationships among entities such as species, genes, and, here, protein families. Both representations show two major groups of families, but the phylogeny is much easier to understand and interpret. By nature, it also presents the relationships among families hierarchically. In addition, the phylogeny includes every family, whereas the network graph leaves some families unconnected. For example, the DUF4159 family is not connected to any other family in the network graph, but it clusters with its closest relative, the ThuA family, in the phylogeny.
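The difference between the two views can be illustrated with a toy example. The sketch below is not the clanviewer or pHMM-tree method: it assumes that a network view draws an edge only when a similarity score passes some cutoff (our assumption; how clanviewer chooses edges is not described here), and it uses invented scores and two placeholder families, `FamA` and `FamB`. A family whose scores all fall below the cutoff becomes an isolated node, yet it still has a well-defined closest relative.

```python
# Hypothetical pairwise similarity scores, invented purely for illustration.
similarity = {
    ("ThuA", "DUF4159"): 0.30,
    ("ThuA", "FamA"): 0.70,
    ("ThuA", "FamB"): 0.65,
    ("FamA", "FamB"): 0.80,
    ("DUF4159", "FamA"): 0.10,
    ("DUF4159", "FamB"): 0.12,
}
families = sorted({name for pair in similarity for name in pair})


def score(a, b):
    """Look up the symmetric similarity between two families."""
    return similarity.get((a, b), similarity.get((b, a)))


def isolated(cutoff):
    """Families with no edge when only scores >= cutoff are drawn."""
    edges = [pair for pair, s in similarity.items() if s >= cutoff]
    connected = {name for pair in edges for name in pair}
    return [f for f in families if f not in connected]


def closest(family):
    """Most similar other family, defined even without an edge above cutoff."""
    others = [f for f in families if f != family]
    return max(others, key=lambda other: score(family, other))


print(isolated(0.5), closest("DUF4159"))
```

With these made-up scores, a cutoff of 0.5 leaves DUF4159 without any edge, while a nearest-relative lookup still pairs it with ThuA, which is the kind of placement a tree always provides.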



The second example is the largest Pfam clan, CL0123 (Helix-turn-helix, or HTH), which contains 254 families. As shown in Figure S8, the network graph cannot be visualized in a global view because too many nodes and edges are stacked on top of each other. Although clanviewer allows one to explore the graph interactively (i.e., zooming in/out or moving nodes by dragging), we believe the classic phylogenetic tree makes it much easier to grasp the relationships among such a large number of families. As an example, we highlighted the 27 DUF (domain of unknown function) families in red in the phylogeny. One can quickly locate these families in the phylogeny and identify their closest neighbors.
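Locating DUF families and their closest neighbors in a tree can also be done programmatically. The following is a minimal sketch on a toy tree written as nested tuples; the leaf names (`HTH_a`, `DUF_x`, and so on) are hypothetical placeholders rather than real Pfam families, and "closest neighbors" is taken here to mean the leaves of the sister subtree, which is one possible reading.

```python
# Toy tree as nested tuples; all leaf names are hypothetical placeholders.
tree = ((("HTH_a", "DUF_x"), "HTH_b"), ("HTH_c", ("DUF_y", "HTH_d")))


def leaves(node):
    """Return all leaf names under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def sister_leaves(node, target):
    """Return the leaves of the sibling subtree(s) of leaf `target`.

    Returns None when `target` does not occur under `node`.
    """
    if isinstance(node, str):
        return None
    if target in node:  # target is a direct child of this node
        return [leaf for child in node if child != target
                for leaf in leaves(child)]
    for child in node:
        found = sister_leaves(child, target)
        if found is not None:
            return found
    return None


# Select leaves whose name marks them as DUF families, then find neighbors.
duf = [name for name in leaves(tree) if name.startswith("DUF")]
neighbors = {name: sister_leaves(tree, name) for name in duf}
print(neighbors)
```

On a real phylogeny, the same idea would apply after parsing the tree file with a phylogenetics library instead of using hand-written tuples.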