2015

1. hfq regulates acid tolerance and virulence by responding to acid stress in Shigella flexneri

Guang Yang, Ligui Wang, Yong Wang, Peng Li, Jiangong Zhu, Shaofu Qiu, Rongzhang Hao, Zhihao Wu, Wuju Li, Hongbin Song

Research in Microbiology 2015;166(6):476-485

Abstract:Shigella flexneri is an important etiological agent of bacillary dysentery in developing countries. The Hfq protein is thought to play a major regulatory role in various cellular processes in this organism. However, the roles of Hfq in stress tolerance and virulence in S. flexneri in response to environmental stress have not been fully studied. In this study, hfq was highly expressed when S. flexneri was exposed to low pH. Growth retardation was observed in the hfq deletion mutant at pH values ranging from 5.0 to 7.0 and the survival rate of the mutant strain was reduced by 60% in acidic conditions (pH 3.0) compared with the wild-type strain. Additionally, competitive invasion assays in HeLa cells and lung invasion assays showed that the virulence of the hfq deletion mutant was significantly decreased. An evaluation of the mechanism revealed that, along with the expression of the Type III secretion system genes, acid resistance genes were also increased with acid stress. Interestingly, a statistically strong linear correlation was observed between the expression of hfq and Type III secretion system genes, as well as between hfq and acid resistance genes, under various pH conditions. In this study, we provide evidence that Hfq regulates genes related to acid resistance for survival under acid stress and controls virulence through the positive regulation of Type III secretion systems. Importantly, we propose that hfq is a key factor in maximal adaptation to host acid stress during infection, regulating acid stress tolerance and virulence in response to acid stress in S. flexneri.

2. Predicting linear B-cell epitopes using amino acid anchoring pair composition

Weike Shen, Yuan Cao, Lei Cha, Xufei Zhang, Xiaomin Ying, Wei Zhang, Kun Ge, Wuju Li and Li Zhong

BioData Mining. 2015 Apr 29;8:14.

Abstract: BACKGROUND: Accurate identification of linear B-cell epitopes plays an important role in peptide vaccine design, immunodiagnosis, and antibody production. Although several prediction methods have been reported, unsatisfactory accuracy has limited their broad use in linear B-cell epitope prediction. Therefore, developing a reliable model with significantly improved prediction accuracy is highly desirable. RESULTS: In this study, we developed a novel model for prediction of linear B-cell epitopes, APCpred, which was derived from the combination of amino acid anchoring pair composition (APC) and Support Vector Machine (SVM) methods. Systematic comparisons with the existing prediction models demonstrated that the APCpred method significantly improved the prediction accuracy both in fivefold cross-validation of training datasets and in independent blind datasets. In the fivefold cross-validation test with the Chen872 dataset at a window size of 20, APCpred achieved an AUC of 0.809 and an accuracy (ACC) of 72.94%, which was much more accurate than the existing models, e.g., Bayesb, Chen's AAP methods and the enhanced combination method of AAP with five AP scales. For the fivefold cross-validation test with the ABC16 dataset, APCpred achieved an improved AUC of 0.794 and ACC of 73.00% at a window size of 16, and attained an AUC of 0.748 and ACC of 67.96% on the Blind387 dataset after being trained with the ABC16 dataset. Trained with the Lbtope_Confirm dataset, APCpred achieved an increased ACC of 55.09% on the FBC934 dataset. Within sequence window sizes from 12 to 20, the final APCpred model on the homology-reduced dataset achieved an optimal AUC of 0.748 and ACC of 68.43% in fivefold cross-validation at a window size of 20. CONCLUSION: The APCpred model demonstrated a significant improvement in predicting linear B-cell epitopes using the features of amino acid anchoring pair composition (APC). Based on our study, a web server has been developed for online prediction of linear B-cell epitopes, freely accessible at http://ccb.bmi.ac.cn/APCpred/.

3. Large-Scale Brain Network Coupling Predicts Total Sleep Deprivation Effects on Cognitive Capacity

Yu Lei, Yongcong Shao, Lubin Wang, Tianye Zhai, Feng Zou, Enmao Ye, Xiao Jin, Wuju Li, Jianlin Qi, Zheng Yang

PLoS One. 2015 July 28; 10(7):e0133959

Abstract:Interactions between large-scale brain networks have received most attention in the study of cognitive dysfunction of human brain. In this paper, we aimed to test the hypothesis that the coupling strength of large-scale brain networks will reflect the pressure for sleep and will predict cognitive performance, referred to as sleep pressure index (SPI). Fourteen healthy subjects underwent this within-subject functional magnetic resonance imaging (fMRI) study during rested wakefulness (RW) and after 36 h of total sleep deprivation (TSD). Self-reported scores of sleepiness were higher for TSD than for RW. A subsequent working memory (WM) task showed that WM performance was lower after 36 h of TSD. Moreover, SPI was developed based on the coupling strength of salience network (SN) and default mode network (DMN). Significant increase of SPI was observed after 36 h of TSD, suggesting stronger pressure for sleep. In addition, SPI was significantly correlated with both the visual analogue scale score of sleepiness and the WM performance. These results showed that alterations in SN-DMN coupling might be critical in cognitive alterations that underlie the lapse after TSD. Further studies may validate the SPI as a potential clinical biomarker to assess the impact of sleep deprivation.

4. Altered Superficial Amygdala–Cortical Functional Link in Resting State After 36 Hours of Total Sleep Deprivation

Yu Lei, Yongcong Shao, Lubin Wang, Enmao Ye, Xiao Jin, Feng Zou, Tianye Zhai, Wuju Li, and Zheng Yang

Journal of Neuroscience Research 2015 Dec;93(12):1795-1803

Abstract: The superficial amygdala (SFA) is important in human emotion/affective processing via its strong connection with other limbic and cerebral cortex for receptive and expressive emotion processing. Few studies have investigated the functional connectivity changes of the SFA under extreme conditions, such as prolonged sleep loss, although the SFA showed a distinct functional connectivity pattern throughout the brain. In this study, resting-state functional magnetic resonance imaging (rs-fMRI) was employed to investigate the changes of SFA-cortical functional connectivity after 36 hr of total sleep deprivation (TSD). Fourteen healthy male volunteers aged 25.9 ± 2.3 years (range 18-28 years) enrolled in this within-subject crossover study. We found that the right SFA showed increased functional connectivity with the right medial prefrontal cortex (mPFC) and decreased functional connectivity with the right dorsal posterior cingulate cortex (dPCC) in the resting brain after TSD compared with that during rested wakefulness. For the left SFA, decreased connectivity with the right dorsal anterior cingulate cortex (dACC) and right dPCC was found. Further regression analysis indicated that the functional link between mPFC and SFA significantly correlated with the Profile of Mood State scores. Our results suggest that the amygdala cannot be treated as a single unit in human neuroimaging studies and that TSD may alter the functional connectivity pattern of the SFA, which in turn disrupts emotional regulation.

5. Different Effects of p52SHC1 and p52SHC3 on the Cell Cycle of Neurons and Neural Stem Cells

Ning Tang, Dan Lyu, Tao Liu, Fangjin Chen, Shuqian Jing, Tianyu Hao, and Shaojun Liu

J Cellular Physiology 2016 Jan;231(1):172-180. Epub 28 SEP 2015.

Abstract:SHC3 is exclusively expressed in postmitotic neurons, while SHC1 is found in neural stem cells and neural precursor cells but absent in mature neurons. In this study, we discovered that suppression of p52SHC1 expression by RNA interference resulted in proliferation defects in neural stem cells, along with significantly reduced protein levels of cyclin E and cyclin A. At the same time, p52SHC3 RNAi caused cell cycle re-entry (9.54% in S phase and 5.70% in G2-M phase) in primary neurons with significantly up-regulated expression of cyclin D1, cyclin E, cyclin A, CDK2, and phosphorylated CDK2. When p52SHC3 was overexpressed, the cell cycle of neural stem cells was arrested with reduced protein levels of cyclin D1, cyclin E, and cyclin A, while overexpression of p52SHC1 did not result in significant changes in postmitotic neurons. Our results indicate that p52SHC3 plays an important role in maintaining the mitotic quiescence of neurons, while p52SHC1 regulates the proliferation of neural stem cells.

6. sRNATarBase 3.0: an updated database for sRNA-target interactions in bacteria

Jiang Wang, Tao Liu, Bo Zhao, Qixuan Lu, Zheng Wang, Yuan Cao, and Wuju Li

Nucleic Acids Res 2015 Oct 25. pii: gkv1127.

Abstract:Bacterial sRNAs are a class of small regulatory RNAs of about 40-500 nt in length; they play multiple biological roles through binding to their target mRNAs or proteins. Therefore, elucidating sRNA targets is very important. However, only targets of a few sRNAs have been described. To facilitate sRNA functional studies such as developing sRNA target prediction models, we updated the sRNATarBase database, which was initially developed in 2010. The new version (recently moved to http://ccb1.bmi.ac.cn/srnatarbase/) contains 771 sRNA-target entries manually collected from 213 papers, and 23 290 and 11 750 predicted targets from sRNATarget and sTarPicker, respectively. Among the 771 entries, 475 and 17 were involved in validated sRNA-mRNA and sRNA-protein interactions, respectively, while 279 had no reported interactions. We also presented detailed information for 316 binding regions of sRNA-target mRNA interactions and related mutation experiments, as well as new features, including NCBI sequence viewer, sRNA regulatory network, target prediction-based GO and pathway annotations, and error report system. The new version provides a comprehensive annotation of validated sRNA-target interactions, and will be a useful resource for bacterial sRNA studies.

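7. A bioinformatics study of human miRNA-Ebola virus interactions

Tao Liu, Jiang Wang, Zheng Wang, Xiaofei Zheng, Wuju Li

Military Medical Sciences (军事医学) 2015, 39(1): 6-11

Abstract: OBJECTIVE: To preliminarily explore interactions between human microRNAs (miRNAs) and the 5′ trailer sequence of the Ebola virus genome, and to identify candidate miRNAs for targeting in the prevention and treatment of Ebola virus infection. METHODS: The Pita and RNAhybrid programs were used to predict human miRNAs that interact with the Ebola virus 5′ trailer sequence, and the predicted miRNAs were annotated and analyzed. RESULTS AND CONCLUSIONS: Human miRNAs may interact with the Ebola virus 5′ trailer sequence in complex ways. Based on earlier reports of interactions between host miRNAs and viral genomes, we propose that the interaction between human miRNAs and the 5′ trailer of the Ebola virus genome may affect both viral replication in the human body and the normal function of human cells. This study offers a new perspective on the prevention and treatment of Ebola virus infection.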

2014

1. Identification of a Tumor Suppressive Human Specific MicroRNA within the FHIT Tumor Suppressor Gene

Baocheng Hu, Xiaomin Ying, Jian Wang, Jittima Piriyapongsa, I. King Jordan, Jipo Sheng, Fang Yu, Po Zhao, Yazhuo Li, Hongyan Wang, Wooi Loon Ng, Shuofeng Hu, Xiang Wang, Chenguang Wang, Xiaofei Zheng, Wuju Li, Walter J. Curran, and Ya Wang

Cancer Research, 2014; doi: 10.1158/0008-5472.CAN-13-3279

Abstract:Loss or attenuated expression of the tumor-suppressor gene FHIT is associated paradoxically with poor progression of human tumors. Fhit promotes apoptosis and regulates reactive oxygen species; however, the mechanism by which Fhit inhibits tumor growth in animals remains unclear. In this study, we used a multidisciplinary approach based on bioinformatics, small RNA library screening, human tissue analysis, and a xenograft mouse model to identify a novel member of the miR-548 family in the fourth intron of the human FHIT gene. Characterization of this human-specific microRNA illustrates the importance of this class of microRNAs in tumor suppression and may influence interpretation of Fhit action in human cancer.

2. The potential biomarker panels for identification of Major Depressive Disorder (MDD) patients with and without early life stress (ELS) by metabonomic analysis

Xinghua Ding, Shuguang Yang, Wuju Li, Yong Liu, Zhiguo Li, Yan Zhang, Lingjiang Li, Shaojun Liu

PLoS ONE, 2014, 9(5): e97479

Abstract: OBJECTIVE: The lack of disease biomarkers to support objective laboratory tests still constitutes a bottleneck in the clinical diagnosis and evaluation of major depressive disorder (MDD) and its subtypes. We used metabonomic techniques to screen for diagnostic biomarker panels in the plasma of MDD patients with and without early life stress (ELS) experience. METHODS: Plasma samples were collected from 25 healthy adults and 46 patients with MDD, including 23 patients with ELS and 23 patients without ELS. Gas chromatography/mass spectrometry (GC/MS) coupled with multivariate statistical analysis was used to identify differences in global plasma metabolites among the 3 groups. RESULTS: Distinctive metabolic profiles exist both between healthy subjects and MDD patients and between MDD patients with ELS experience (ELS/MDD patients) and those without it (non-ELS/MDD patients), and some diagnostic panels built from combinations of feature metabolites have higher predictive potential than panels of differential metabolites. CONCLUSIONS: These findings have high potential for use as a novel laboratory diagnostic tool for MDD patients with or without ELS in clinical applications.

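3. A preliminary study of genome-wide protein-RNA interactions in Escherichia coli

Song Xu, Yaowen Chen, Xiaomin Ying, Hanjiang Fu, Baolei Tian, Yi Song, Xiaofei Zheng, Wuju Li

Military Medical Sciences (军事医学), 2014, 38(8): 612-616

Abstract: OBJECTIVE: To conduct a preliminary genome-wide study of protein-RNA interactions (PRI) in Escherichia coli. METHODS: Bacterial lysates were digested with RNase, the RNA fragments interacting with proteins were extracted, a cDNA library was constructed and subjected to high-throughput sequencing, and the protein-bound transcripts were identified by bioinformatics analysis. RESULTS: A total of 3193 protein-bound transcripts were obtained, comprising 2234 mRNAs, 47 sRNAs (small regulatory RNAs), 39 tRNAs, 11 rRNAs and 862 intergenic regions (IGRs). CONCLUSIONS: Preliminary information on protein-interacting transcripts in E. coli was obtained, providing support for further PRI studies.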

2013

1. Optimisation of reverse transcription loop-mediated isothermal amplification assay for the rapid detection of pandemic (H1N1) 2009 virus

Xin Cai, Zha Lei, Wen-liang Fu, Zhe-yi Zhu, Min-ji Zou, Jie Gao, Yuan-yuan Wang, Min Hong, Jia-xi Wang, Wu-ju Li, Dong-gang Xu

Afr. J. Microbiol. Res. 2013, 7: 3919-3925

Abstract: Conventional reverse transcriptase polymerase chain reaction (RT-PCR) and an optimised closed-tube reverse-transcription loop-mediated isothermal amplification (RT-LAMP) method were used for detection of pandemic (H1N1) 2009 virus, and the two methods were compared with respect to specificity and sensitivity. In this study, the optimised RT-LAMP detected 2 copies of target RNA by visual detection with a modified dye. Reaction time, temperature and quantity of each reagent were optimised for the detection of the virus. The sensitivity of the optimised RT-LAMP was 100 times that of conventional RT-PCR. Amplification of DNA can be identified by visualization with the modified dye, which reduces the cross-contamination caused by opening the tube. The sensitivity of visual detection was equivalent to that of electrophoresis analysis. Additionally, the method was specific, as no cross-reaction was observed among samples from human blood, Escherichia coli and other related viruses including human seasonal influenza A subtypes H1N1, H1N2 and H3N2. These results demonstrate that the optimised RT-LAMP assay for pandemic (H1N1) 2009 virus RNA is a simple, rapid and specific tool that is well suited to the screening and surveillance of influenza in developing countries.

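2. A preliminary comparison of in vivo and in vitro RNA-RNA interactions

Qixuan Lu, Lei Cha, Zongcheng Li, Linxi Chen, Wuju Li, Xiaomin Ying

Military Medical Sciences (军事医学), 2013, 37(7): 517-520

Abstract: OBJECTIVE: To assess how reliably in vivo RNA-RNA interactions (RRIs) can be inferred from in vitro RRIs by comparing the two. METHODS: Perl scripts were written to analyze transcriptome-wide in vitro RNA secondary structure data from yeast and derive candidate in vitro RRIs, which were then compared with in vivo small nucleolar RNA (snoRNA)-rRNA interactions in yeast. RESULTS: The overlap between in vitro and in vivo snoRNA-rRNA interactions was only 23.42% (26/111); the overlap between snoRNA double-stranded fragments determined in vitro and snoRNA fragments involved in RRIs determined in vivo was 38.78% (19/49); and the overlap between rRNA double-stranded fragments determined in vitro and rRNA fragments involved in RRIs determined in vivo was 80.70% (46/57). CONCLUSIONS: snoRNA-rRNA interactions differ greatly between in vitro and in vivo conditions, suggesting that snoRNA-rRNA interactions measured in vitro do not faithfully reflect those occurring in vivo.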
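The three overlap rates quoted in this abstract are plain ratios of the reported counts, and they can be rechecked with a few lines of Python. This is only an arithmetic sketch: the function name `overlap_rate` and the dictionary labels are illustrative and do not come from the paper; the counts are the ones quoted above.

```python
def overlap_rate(shared: int, total: int) -> float:
    """Return shared/total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * shared / total, 2)


# Counts as quoted in the abstract; the labels are illustrative only.
reported = {
    "snoRNA-rRNA interactions": (26, 111),
    "snoRNA fragments": (19, 49),
    "rRNA fragments": (46, 57),
}

for label, (shared, total) in reported.items():
    print(f"{label}: {overlap_rate(shared, total):.2f}%")
```

The rounded values agree with the percentages given in the abstract (23.42%, 38.78% and 80.70%).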

2012

1. Computational tools for predicting sRNA targets

Wuju Li, Xiaomin Ying, Lei Cha

Regulatory RNAs in Prokaryotes, 2011, ISBN 978-3-7091-0217-6: 165-177

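2. BioSunLAMP: a primer design software for loop-mediated isothermal amplification

Lei Cha, Xin Cai, Xiaomin Ying, Donggang Xu, Yuan Cao, Wuju Li

Military Medical Sciences (军事医学), 2012, 36(3): 230-233

Abstract: OBJECTIVE: Loop-mediated isothermal amplification (LAMP) is a novel isothermal nucleic acid amplification method. Because it is simple, efficient, highly specific and inexpensive, it has been widely used to detect epidemic diseases such as influenza A (H1N1) and tuberculosis. The key initial step in LAMP is designing suitable primer sequences. To make primer design more convenient and efficient, we developed the LAMP primer design software BioSunLAMP. METHODS: A user-friendly, easy-to-use software system was developed in the Delphi programming language. RESULTS: In experimental validation on influenza A (H1N1) virus and Mycobacterium tuberculosis, primers designed with BioSunLAMP achieved the expected results. Compared with similar software, BioSunLAMP also has the following features: (1) it integrates primer design with primer specificity analysis, and can check primer specificity against a local database or by remotely querying the relevant NCBI databases; (2) it supports the design of both universal and specific primers for multiple sequences. CONCLUSIONS: BioSunLAMP provides good bioinformatics support for the wider adoption of LAMP.

3. Prediction of RNA-binding sites in proteins

Lei Cha, Xiaomin Ying, Wuju Li

Military Medical Sciences (军事医学), 2012, 36(6): 830-834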

4. Predicting sRNAs and their targets in bacteria

Wuju Li, Xiaomin Ying, Qixuan Lu, Linxi Chen

Genomics Proteomics Bioinformatics, 2012,10:276-284

Abstract:Bacterial small RNAs (sRNAs) are an emerging class of regulatory RNAs of about 40-500 nucleotides in length and, by binding to their target mRNAs or proteins, get involved in many biological processes such as sensing environmental changes and regulating gene expression. Thus, identification of bacterial sRNAs and their targets has become an important part of sRNA biology. Current strategies for discovery of sRNAs and their targets usually involve bioinformatics prediction followed by experimental validation, emphasizing a key role for bioinformatics prediction. Here, therefore, we provided an overview on prediction methods, focusing on the merits and limitations of each class of models. Finally, we will present our thinking on developing related bioinformatics models in future.

5. Identification and Expression of Small Non-Coding RNA, L10-Leader, in Different Growth Phases of Streptococcus mutans

Li Xia, Wei Xia, Shaohua Li, Wuju Li, ..., Ningsheng Shao, and Bingfeng Chu

Nucleic Acid Therapeutics, 2012,22(3):177-186

Abstract:Streptococcus mutans is one of the major cariogenic bacteria in the oral environment. Small non-coding RNAs (sRNAs) play important roles in the regulation of bacterial growth, stress tolerance, and virulence. In this study, we experimentally verified the existence of sRNA, L10-Leader, in S. mutans for the first time. Our results show that the expression level of L10-Leader was growth-phase dependent in S. mutans and varied among different clinical strains of S. mutans. The level of L10-Leader in S. mutans UA159 was closely related to the pH value, but not to the concentrations of glucose and sucrose in culture medium. We predicted target mRNAs of L10-Leader bioinformatically and found that some of these mRNAs were related to growth and stress response. Five predicted mRNA targets were selected and detected by quantitative reverse transcriptase-polymerase chain reaction (qRT-PCR), and we found that the expression levels of these mRNAs were closely related to the level of L10-Leader at different growth phases of the bacteria. Our results indicate that L10-Leader may play an important role in the regulation of responses in S. mutans, especially during its growth phase and acid adaption response.

6. A modified visual loop-mediated isothermal amplification method for diagnosis and differentiation of main pathogens from Mycobacterium tuberculosis complex

Ming Hong, Lei Zha, Wenliang Fu, Minji Zou, Wuju Li, Donggang Xu

World J Microbiol Biotechnol (2012) 28:523–531

Abstract:This study was aimed to rapidly identify and differentiate two main pathogens of the Mycobacterium tuberculosis complex: Mycobacterium tuberculosis subsp. tuberculosis and Mycobacterium bovis by a modified loop-mediated isothermal amplification (LAMP) assay. The reaction results could be evaluated by naked eye with two optimized closed tube detection methods as follows: adding the modified fluorescence dye in advance into the reaction mix so as to observe the color changes or putting a tinfoil in the tube and adding the SYBR Green I dye on it, then making the dye drop into the bottom of the tube by centrifuge after reaction. The results showed that the two groups of primers used jointly in this assay could successfully identify and differentiate Mycobacterium tuberculosis subsp. tuberculosis and Mycobacterium tuberculosis bovis. Sensitivity test displayed that the modified LAMP assay with the closed tube system could determine the minimal template concentration of 1 copy/μl, which was more sensitive than that of routine PCR. The advantages of this LAMP method for detection of the Mycobacterium tuberculosis complex included high specificity, high sensitivity, simplicity, and superiority in avoidance of aerosol contamination. The modified LAMP assay would provide a potential for clinical diagnosis and therapy of tuberculosis in the developing countries and the resource-limited areas.

2011

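1. Predicting Escherichia coli sRNAs based on sequence features of transcription termination sites

Qian Liu, Xiaomin Ying, Jiayao Wu, Lei Cha, Wuju Li

Acta Biophysica Sinica (生物物理学报), 2011, 27(3): 257-264

Abstract: Bacterial sRNAs are a class of regulatory RNAs 40-500 nt in length that play important roles in the interaction between bacteria and their environment, so identifying bacterial sRNAs is of great significance. However, unlike protein-coding genes, which have readily recognizable features, bacterial sRNAs remain difficult to identify. This paper presents a prediction strategy that identifies sRNAs using a base frequency matrix built from the transcription termination sites of known bacterial sRNAs, and applies it to sRNA prediction in Escherichia coli K-12 MG1655. The results show that the model achieves high specificity and a high positive detection rate on an independent test set; the method should therefore provide good bioinformatics support for the experimental discovery of bacterial sRNAs.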
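The general idea of a base frequency matrix can be sketched as follows. This is a generic position-frequency-matrix illustration, not the paper's model: the window length, the pseudocount, the uniform background, the log-odds scoring and the function names `frequency_matrix` and `log_odds_score` are all assumptions, and the example sequences are invented, not real E. coli terminators.

```python
import math

BASES = "ACGU"


def frequency_matrix(seqs, pseudocount=1.0):
    """Per-position base frequencies for equal-length, pre-aligned RNA windows."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    matrix = []
    for pos in range(length):
        # The pseudocount keeps unseen bases from getting a zero frequency.
        counts = {base: pseudocount for base in BASES}
        for s in seqs:
            counts[s[pos]] += 1
        total = sum(counts.values())
        matrix.append({base: counts[base] / total for base in BASES})
    return matrix


def log_odds_score(seq, matrix, background=0.25):
    """Sum of log2(frequency / background) over positions; higher means more motif-like."""
    if len(seq) != len(matrix):
        raise ValueError("sequence length must match the matrix length")
    return sum(math.log2(matrix[i][base] / background) for i, base in enumerate(seq))


# Invented toy windows (NOT real terminator sequences).
known = ["GCUUUU", "GCUUUU", "CCUUUU", "GGUUUU"]
pfm = frequency_matrix(known)
print(log_odds_score("GCUUUU", pfm) > log_odds_score("AAACGA", pfm))
```

A window resembling the training set scores higher than an unrelated one; in practice a score cut-off would have to be chosen on held-out data.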

2. Generate gene expression profile from high-throughput sequencing data

Hui LIU, Zhichao JIANG, Xiangzhong FANG, Hanjiang FU, Xiaofei ZHENG, Lei CHA, Wuju LI

Front. Math. China 2011, 6(6): 1131–1145

Abstract: This work presents two methods, the least-squares method and the Bayesian method, to solve the multiple-mapping problem in extracting gene expression profiles from next-generation sequencing data. We parallel the tag sequences to the genome and partition them to improve the methods' efficiency. The essential feature of these methods is that they can solve the multiple-mapping problem between genes and short reads, while generating almost the same estimates in the single-mapping situation as the traditional approaches. The two methods are compared by simulation and on a real example, which was generated from radiation-induced lung cancer cells (A549) by mapping short reads to a human ncRNA database. The results show that the Bayesian method, as realized by a Gibbs sampler, is more efficient and robust than the least-squares method.

3. sTarPicker: A Method for Efficient Prediction of Bacterial sRNA Targets Based on a Two-Step Model for Hybridization

Xiaomin Ying, Yuan Cao, Jiayao Wu, Qian Liu, Lei Cha, Wuju Li

PLoS ONE 6(7): e22705.

Abstract

Background: Bacterial sRNAs are a class of small regulatory RNAs involved in regulation of expression of a variety of genes. Most sRNAs act in trans via base-pairing with target mRNAs, leading to repression or activation of translation or mRNA degradation. To date, more than 1,000 sRNAs have been identified. However, direct targets have been identified for only approximately 50 of these sRNAs. Computational predictions can provide candidates for target validation, thereby increasing the speed of sRNA target identification. Although several methods have been developed, target prediction for bacterial sRNAs remains challenging.

Results: Here, we propose a novel method for sRNA target prediction, termed sTarPicker, which was based on a two-step model for hybridization between an sRNA and an mRNA target. This method first selects stable duplexes after screening all possible duplexes between the sRNA and the potential mRNA target. Next, hybridization between the sRNA and the target is extended to span the entire binding site. Finally, quantitative predictions are produced with an ensemble classifier generated using machine-learning methods. In calculations to determine the hybridization energies of seed regions and binding regions, both thermodynamic stability and site accessibility of the sRNAs and targets were considered. Comparisons with the existing methods showed that sTarPicker performed best in both performance of target prediction and accuracy of the predicted binding sites.

Conclusions: sTarPicker can predict bacterial sRNA targets with higher efficiency and determine the exact locations of the interactions with a higher accuracy than competing programs. sTarPicker is available at http://ccb.bmi.ac.cn/starpicker/.
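The first step described above, screening candidate duplexes between an sRNA and an mRNA, can be illustrated with a much-simplified sketch. It is not sTarPicker's algorithm: it looks only for uninterrupted antiparallel base-paired runs (Watson-Crick plus G-U wobble) and ignores hybridization energy, site accessibility and the ensemble classifier. The names `PAIRS`, `find_seeds` and `min_len`, and the toy sequences, are hypothetical.

```python
# Allowed base pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_seeds(srna, mrna, min_len=5):
    """Return (srna_start, mrna_start, length) for maximal antiparallel paired runs.

    sRNA position i pairs with mRNA position j, then i+1 with j-1, and so on.
    mrna_start is the 5' end (lowest index) of the paired mRNA segment.
    """
    seeds = []
    n, m = len(srna), len(mrna)
    for i in range(n):
        for j in range(m):
            if (srna[i], mrna[j]) not in PAIRS:
                continue
            # Skip positions in the middle of a run so each run is reported once.
            if i > 0 and j + 1 < m and (srna[i - 1], mrna[j + 1]) in PAIRS:
                continue
            length = 0
            while (i + length < n and j - length >= 0
                   and (srna[i + length], mrna[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                seeds.append((i, j - length + 1, length))
    return seeds


# Toy sequences invented for illustration.
seeds = find_seeds("CCGAGCCUU", "AAAGGCUCAAA")
for srna_start, mrna_start, length in seeds:
    print(srna_start, mrna_start, length)
```

A real predictor would rank such candidates by computed duplex stability rather than by length alone, which is where the thermodynamic terms mentioned in the abstract come in.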

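4. Genome-wide prediction of human microRNA precursors

Xiaomin Ying, Juanjuan Zhu, Xiaolei Wang, Dongsheng Zhao, Hanjiang Fu, Xiaofei Zheng, Wuju Li

Scientia Sinica Vitae (中国科学 生命科学), 2011, 41(10): 958-964

Abstract: MicroRNAs (miRNAs) are a class of small regulatory RNAs that do not encode proteins and that perform broad and important regulatory functions in eukaryotes. Because miRNA expression is spatially and temporally specific, computational prediction followed by targeted experimental validation is an important route to miRNA discovery. Reducing the false-positive rate is a major challenge for miRNA prediction methods. In this study, we used ensemble learning to build SVMbagging, a classifier for predicting miRNA precursors. Results on the training set, test set and independent test set show that the method is robust, has a low false-positive rate and generalizes well. In particular, at a threshold of 0.9 the specificity reaches 99.90% with a sensitivity above 26%, which makes the method suitable for genome-wide prediction. Applying SVMbagging to the whole human genome at a threshold of 0.9 yielded 14933 candidate miRNA precursors. Comparison with high-throughput small RNA sequencing data showed that 4481 of these precursors have perfectly matching small RNA sequences, a number very close to the theoretically estimated number of true positives. Finally, 32 candidate miRNAs were tested experimentally, and 2 of them were confirmed as genuine miRNAs.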
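How a bagging ensemble with a strict threshold trades sensitivity for specificity can be sketched with a toy example. This is an illustration of bagging in general, not SVMbagging itself: the base learner here is a one-feature cut-off rather than an SVM, the ensemble score is assumed to be the fraction of base models voting positive, and `train_stump`, `train_bagging`, `ensemble_score`, `is_candidate` and the toy data are all hypothetical.

```python
import random


def train_stump(sample):
    """Fit a one-feature cut-off: the midpoint between the two class means."""
    pos = [x for x, label in sample if label == 1]
    neg = [x for x, label in sample if label == 0]
    if not pos or not neg:
        return None  # this bootstrap sample missed a class; the caller redraws
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def train_bagging(data, n_models=25, seed=0):
    """Train each base model on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    cuts = []
    while len(cuts) < n_models:
        bootstrap = [rng.choice(data) for _ in data]
        cut = train_stump(bootstrap)
        if cut is not None:
            cuts.append(cut)
    return cuts


def ensemble_score(cuts, x):
    """Fraction of base models that vote 'positive' for feature value x."""
    return sum(x > cut for cut in cuts) / len(cuts)


def is_candidate(cuts, x, threshold=0.9):
    """Accept only when at least `threshold` of the base models agree."""
    return ensemble_score(cuts, x) >= threshold


# Toy one-feature data (value, label), invented for illustration.
toy = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0), (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
cuts = train_bagging(toy)
print(ensemble_score(cuts, 5.0), is_candidate(cuts, 5.0))
```

Raising the threshold can only shrink the set of accepted candidates, which is the mechanism behind the high-specificity, lower-sensitivity operating point reported at 0.9.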

2010

1. sRNATarBase: A comprehensive database of bacterial sRNA targets verified by experiments

Yuan Cao, Jiayao Wu, Qian Liu, Yalin Zhao, Xiaomin Ying, Lei Cha, Ligui Wang, and Wuju Li

RNA (2010), 16:2051–2057

ABSTRACT:Bacterial sRNAs are an emerging class of small regulatory RNAs, 40–500 nt in length, which play a variety of important roles in many biological processes through binding to their mRNA or protein targets. A comprehensive database of experimentally confirmed sRNA targets would be helpful in understanding sRNA functions systematically and provide support for developing prediction models. Here we report on such a database—sRNATarBase. The database holds 138 sRNA–target interactions and 252 noninteraction entries, which were manually collected from peer-reviewed papers. The detailed information for each entry, such as supporting experimental protocols, BLAST-based phylogenetic analysis of sRNA–mRNA target interaction in closely related bacteria, predicted secondary structures for both sRNAs and their targets, and available binding regions, is provided as accurately as possible. This database also provides hyperlinks to other databases including GenBank, SWISS-PROT, and MPIDB. The database is available from the web page http://ccb.bmi.ac.cn/srnatarbase/.

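2. Several methods for determining the results of RT-LAMP detection of influenza A (H1N1) virus nucleic acid

Wenliang Fu, Xin Cai, Lei Cha, Jie Gao, Ming Hong, Minji Zou, Wuju Li, Donggang Xu, Kuiwu Wu

Journal of Medical Research (医学研究杂志), 2010, 39(5): 68-70

Abstract: OBJECTIVE: To compare several methods of determining the results of RT-LAMP detection of influenza A (H1N1) virus nucleic acid and to optimize the detection method. METHODS: Using known positive virus samples, we compared the detection sensitivity of electrophoresis, direct observation, addition of SYBR Green I nucleic acid dye, and an optimized pre-added nucleic acid dye. RESULTS: Electrophoresis was the most sensitive; adding SYBR Green I dye was slightly less sensitive; and direct observation with the naked eye was about 2 orders of magnitude less sensitive. The optimized pre-added dye gave a detection sensitivity comparable to that of SYBR Green I. CONCLUSIONS: Electrophoresis has the highest sensitivity. Adding the optimized pre-added dye can raise the sensitivity of visual determination of the reaction result, enhance the specificity of the reaction, and reduce aerosol contamination.

3. Rapid detection of the main pathogens of the Mycobacterium tuberculosis complex by loop-mediated isothermal amplification

Ming Hong, Lei Cha, Wenliang Fu, Minji Zou, Wuju Li, Kuiwu Wu, Donggang Xu

Journal of Medical Research (医学研究杂志), 2010, 39(11): 60-63

Abstract: OBJECTIVE: To rapidly diagnose and differentiate the two main pathogens of the Mycobacterium tuberculosis complex. METHODS: Loop-mediated isothermal amplification (LAMP) was used to establish a method for rapidly detecting and differentiating the main pathogens of the Mycobacterium tuberculosis complex. The method was tested for specificity on relevant clinical isolates, and its sensitivity was analyzed using 10-fold serial dilutions of DNA templates from known strains. After the reaction, results were determined by electrophoresis or by naked-eye inspection after adding DNA dye to the reaction tube. RESULTS: The method successfully detected the main tuberculosis pathogens, human-type and bovine-type Mycobacterium tuberculosis, and showed no non-specific cross-reaction with other related strains, including Bacillus Calmette-Guérin (BCG). The detection sensitivity reached 100 copies/μl, higher than that of conventional PCR. CONCLUSIONS: The method is sensitive, specific, low-cost and rapid; it can detect and differentiate human-type and bovine-type Mycobacterium tuberculosis and can rule out interference from BCG in diagnosis.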

2009

1. BioSunMS: a plug-in-based software for the management of patients information and the analysis of peptide profiles from mass spectrometry

Yuan Cao, Na Wang, Xiaomin Ying, Ailing Li, Hengsha Wang, Xuemin Zhang and Wuju Li

BMC Medical Informatics and Decision Making 2009, 9:13. doi:10.1186/1472-6947-9-13

Abstract:
Background
With wide applications of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS), statistical comparison of serum peptide profiles and management of patients information play an important role in clinical studies, such as early diagnosis, personalized medicine and biomarker discovery. However, current available software tools mainly focused on data analysis rather than providing a flexible platform for both the management of patients information and mass spectrometry (MS) data analysis.
Results
Here we presented a plug-in-based software, BioSunMS, for both the management of patients information and serum peptide profiles-based statistical analysis. By integrating all functions into a user-friendly desktop application, BioSunMS provided a comprehensive solution for clinical researchers without any knowledge in programming, as well as a plug-in architecture platform with the possibility for developers to add or modify functions without need to recompile the entire application.
Conclusion
BioSunMS provides a plug-in-based solution for managing, analyzing, and sharing high volumes of MALDI-TOF or SELDI-TOF MS data. The software is freely distributed under GNU General Public License (GPL) and can be downloaded from http://sourceforge.net/projects/biosunms/


2. sRNATarget: a web server for prediction of bacterial sRNA targets

Yuan Cao, Yalin Zhao, Lei Cha, Xiaomin Ying, Ligui Wang, Ningsheng Shao, Wuju Li*

Bioinformation 3(8): 364-366 (2009)

Abstract: In bacteria, there exist some small non-coding RNAs (sRNAs) 40–500 nucleotides in length. Most of them function in post-transcriptional regulation of gene expression by binding to their target mRNAs, with the Hfq protein acting as an RNA chaperone. With the increasing number of identified sRNA genes in bacteria, prediction of sRNA targets plays an increasingly important role in determining sRNA functions. However, few computational tools for predicting sRNA targets are currently available. Here we introduce a web server, sRNATarget, for genome-scale prediction of bacterial sRNA targets. The server is based on a recently published model that uses the Naive Bayes method as the supervised learning method and takes the RNA secondary structure profile as the feature. The prediction results are returned to users by e-mail.


3. BatchGenAna: a batch platform for large-scale genomic analysis of mammalian small RNAs

Xiaomin Ying, You Jung Kim, Yiqing Mao, Ming Liu, Yanyan Hou, Hua Li, Xiaolei Wang, Yalin Zhao, Dongsheng Zhao, Jignesh M. Patel, Wuju Li*

Bioinformation 3(8): 346-348 (2009)

Abstract:An increasing number of small RNAs have been discovered in mammals. However, their primary transcripts and upstream regulatory networks remain largely to be determined. Genomic analysis of small RNAs facilitates identification of their primary transcripts, and hence contributes to researches of their upstream regulatory networks. We here report a batch platform, BatchGenAna, which is specifically designed for large-scale genomic analysis of mammalian small RNAs. It can map and annotate for as many as 1000 small RNAs or 10,000 genomic loci of small RNAs at a time. It provides genomic features including RefSeq genes, mRNAs, ESTs and repeat elements in tabular and graphical results. It also allows extracting flanking sequences of submitted queries, specified genomic regions and host transcripts, which facilitates subsequent analysis such as scanning transcription factor binding sites in upstream sequences and poly(A) signals in downstream sequences. Besides small RNA fields, BatchGenAna can also be applied to other research fields, e.g. in silico analysis of target genes of transcription factors.


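4. Advances in the prediction of bacterial sRNA genes and their targets

Ligui Wang, Yalin Zhao, Wuju Li

Acta Microbiologica Sinica (微生物学报)

Abstract: Bacterial sRNAs are a class of non-coding RNAs 40-500 nt in length that exert their biological functions mainly by interacting with the 5′ end of target mRNAs through imperfect base pairing. Because prediction methods can guide the experimental discovery of bacterial sRNAs and their targets, research on sRNA and target prediction has received wide attention. This review first divides sRNA prediction methods into three categories: methods based on comparative genomics, methods based on transcription units, and methods based on machine learning. It then divides sRNA target prediction methods into two categories: sequence comparison methods and methods based on RNA secondary structure. Finally, it analyzes the principles, core ideas, advantages and limitations of each category and discusses directions for further development.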


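5. Automated design for high-level expression of foreign genes in the pBV220 vector

Lei Cha, Xiaomin Ying, Ligui Wang, Yuan Cao, Zhigang Luo, Bo Yuan*, Wuju Li*

Chinese Journal of Biochemistry and Molecular Biology (生物化学与分子生物学学报)

Abstract: pBV220 is a prokaryotic expression vector constructed by Chinese scientists and is still widely used. However, achieving high-level expression of a foreign gene requires weighing many factors together, such as RNA secondary structure, which is extremely time-consuming and laborious. We therefore wrote software for the automated design of high-level foreign gene expression, based on our previously proposed mathematical model of high-level foreign gene expression in the pBV220 vector. The software can also be used qualitatively to analyze foreign gene expression levels in other prokaryotic vectors, and should ultimately help speed up experiments.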


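6、Computer-Aided Design and Analysis of Molecular Biology Experiments

Editor-in-chief: 李伍举

Publisher: Military Medical Science Press (军事医学科学出版社) ISBN: 9787802450752

Synopsis: Computer-Aided Design and Analysis of Molecular Biology Experiments is one volume of the Biomedical Experimental Techniques book series. The book comprises 5 parts with 26 chapters in total and 4 appendices. It covers PCR experiment design, RNA secondary structure prediction, ribozyme design, antisense nucleic acid design, siRNA design, design of high-level expression of foreign genes in the pBV220 vector, design of high-level expression of foreign genes in the pPIC9 vector, B-cell epitope prediction, T-cell epitope prediction, protein tertiary structure prediction and display, protein functional site analysis, probe design for oligonucleotide microarrays, identification of differentially expressed genes from gene expression profiles, sample classification based on gene expression profiles, sample clustering based on gene expression profiles, bioinformatics analysis with Perl and BioPerl, and bioinformatics analysis with MATLAB. It is of some help in speeding up experiments and raising their success rate.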

 

 

2008

1.Construction of two mathematical models for prediction of bacterial sRNA targets

Yalin Zhao, Hua Li, Yanyan Hou, Lei Cha, Yuan Cao, Ligui Wang, Xiaomin Ying, Wuju Li

Biochemical and Biophysical Research Communications,2008, 372:346–350

 

Abstract: Accurate prediction of sRNA targets plays a key role in determining sRNA functions. Here we introduced two mathematical models, sRNATargetNB and sRNATargetSVM, for prediction of sRNA targets using the Naïve Bayes method and support vector machines (SVM), respectively. The training dataset was composed of 46 positive samples (real sRNA-target interactions) and 86 negative samples (no interaction between sRNA and targets). The leave-one-out cross-validation (LOOCV) classification accuracy was 91.67% for sRNATargetNB, and 100.00% for sRNATargetSVM. To evaluate the performance of the models, an independent test dataset was used, which contained 22 positive samples and 1700 randomly generated negative samples. The results showed that the classification accuracy, sensitivity, and specificity were 93.03%, 40.90%, and 93.71% for sRNATargetNB and 80.55%, 72.73%, and 80.65% for sRNATargetSVM, respectively. Therefore, the presented models provide support for experimental identification of sRNA targets. The related software and supplementary materials can be downloaded from webpage www.biosun.org.cn/srnatarget/.
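As a reading aid, the three test-set figures can be related to one another with the standard confusion-matrix definitions. The sketch below is not from the paper: the counts of 16 true positives and 1371 true negatives are inferred here from the reported sRNATargetSVM percentages and the stated test-set sizes (22 positives, 1700 negatives), and should be treated as an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)        # fraction of real interactions recovered
    specificity = tn / (tn + fp)        # fraction of non-interactions rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Counts inferred (assumption) from the reported sRNATargetSVM percentages.
acc, sens, spec = classification_metrics(tp=16, fn=6, tn=1371, fp=329)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```

Because negatives outnumber positives by roughly 77 to 1 in this test set, accuracy tracks specificity closely, which is why sRNATargetNB can show the higher accuracy despite the lower sensitivity.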

 

Full Text Download:

 

 

 

2.Identification and verification of microRNA in wheat

 

Weibo Jin, Nannan Li, Bin Zhang, Fangli Wu, Wuju Li, Aiguang Guo, Zhiyong Deng

J Plant Res, 2008, 121:351–355

 

Abstract: MicroRNAs (miRNAs) are small, endogenous RNAs that regulate gene expression in both plants and animals. A large number of miRNAs have been identified from various animals and model plant species such as Arabidopsis thaliana and rice (Oryza sativa); however, characteristics of wheat (Triticum aestivum) miRNAs are poorly understood. Here, computational identification of miRNAs from wheat EST sequences was performed by using the in-house program GenomicSVM, a prediction model for miRNAs. This study resulted in the discovery of 79 miRNA candidates. Nine out of 22 miRNA representatives randomly selected from the 79 candidates were experimentally validated with Northern blotting, indicating that prediction accuracy is about 40%. For the 9 validated miRNAs, 59 wheat ESTs were predicted as their putative targets.

 

Full Text Download:

 

 

 

3.Construction of an SVM-based model for prediction of bacterial sRNA targets

 

赵雅琳,李华,侯妍妍,查磊,曹源,王立贵,应晓敏,李伍举

军事医学科学院院刊,2008,32:375-378

 

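[Abstract] Objective: To provide bioinformatics support for the experimental identification of sRNA targets and the study of sRNA functions. Methods: First, 132 experimentally verified sRNA-target interaction records, comprising 46 positive and 86 negative samples, were used as the training set. Second, 22 experimentally verified positive samples and 1700 randomly generated negative samples were used as the test set. Finally, a mathematical model for sRNA target prediction was constructed with the support vector machine method, using features such as the RNA secondary structure profile as variables. Results and conclusion: The sensitivity and specificity of the model were both 100% on the training set, and 72.73% and 80.65%, respectively, on the test set. The model provides bioinformatics support for the experimental discovery of sRNA targets.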

 

Full Text Download:

 

 

 

4.MiRscreen: a method for identifying microRNA precursors based on a genetic algorithm and support vector machines

 

侯妍妍,李华,应晓敏,李伍举

军事医学科学院院刊,2008,32:287-292

 

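[Abstract] Objective: To construct a model with high sensitivity and high specificity for identifying microRNA precursors (pre-miRNAs). Methods: Using 300 experimentally verified human pre-miRNAs and 300 negative samples randomly selected from 3′UTR fragments that fold into stem-loop structures, we built MiRscreen, a support vector machine classifier that distinguishes pre-miRNAs from pseudo pre-miRNAs. To improve the classifier's performance, we used a genetic algorithm to search for C and γ, two important parameters that affect it. Results and conclusion: On the training set the classifier had a sensitivity of 99.33% and a specificity of 100%; on the remaining 91 human pre-miRNAs and 91 pseudo pre-miRNAs from 3′UTRs, sensitivity and specificity reached 91.21% (83/91) and 93.41% (85/91), respectively. Among 1,353 pre-miRNAs from 20 animal and viral species other than human, MiRscreen correctly identified 1,192, a sensitivity of 88.10%; sensitivity reached 100% for 8 species: Marek's disease virus, rhesus lymphocryptovirus, Epstein-Barr virus, simian virus 40, African clawed frog, dog, sheep, and rhesus macaque. On 556 pseudo pre-miRNAs folded from 100 randomly selected RefSeq genes and 797 randomly selected pseudo pre-miRNAs folded from human chromosome 19 (1,353 mixed negative samples in total), the specificity of MiRscreen reached 85.14% (1,152/1,353). Compared with 6 other methods of the same kind, MiRscreen performed well in both sensitivity and specificity and had the highest classification accuracy, 86.62%, more than 6% higher than the other methods; its AUC reached 0.938, also clearly higher than the other methods.
[Keywords] microRNAs; identification; genetic algorithm; support vector machine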

 

Full Text Download:

 

 

 

 

5.Advances in computational methods for microRNA discovery

 

侯妍妍,应晓敏,李伍举

遗传,2008,30:687-696

 

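Abstract: microRNAs (miRNAs) are a class of endogenous non-coding small RNAs of ~21 nt discovered in recent years that play important and broad regulatory roles in plants and animals. They are discovered mainly by two routes: cDNA cloning and sequencing, and computational discovery. cDNA cloning and sequencing is affected by the temporal and tissue specificity of miRNA expression and by expression level, and computational discovery can compensate for these shortcomings, so computational methods for miRNA discovery have received wide attention. This article reviews recent progress in the computational discovery of miRNAs. According to their nature, the methods are grouped into 5 classes: homologous fragment search, prediction based on comparative genomics, prediction based on scoring of sequence and structural features, prediction combined with targets, and prediction based on machine learning. The principles, core ideas, strengths, and limitations of each class are analyzed, and directions for further development are discussed.
Keywords: microRNA; computational discovery; homology search; comparative genomics; targets; machine learning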

 

Full Text Download:

 

 

 

 

6.In silico cloning extension of esophageal cancer-related genes of unknown function and discovery of ncRNAs

 

 吴炳礼,许丽艳,应晓敏,牛永东,李伍举,李恩民

 癌变 畸变 突变, 2008,20:85-88

 

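 [Abstract] Background and aim: To study, with bioinformatics methods such as in silico cloning, 48 screened DNA sequence fragments of unknown function related to esophageal cancer, and so to guide esophageal cancer research. Materials and methods: Taking the 48 DNA sequence fragments as the core, a local database was built with BioEdit; the gene fragments of unknown function among the 48 WRU sequences were extended by in silico cloning; and BLAST homology analysis was used to search the introns and the upstream and downstream intergenic regions of the 48 genes for non-coding RNAs (ncRNAs). Results: The gene fragments of unknown function in the 48 DNA sequences could be extended by more than 190 bp on average through in silico cloning; fragments highly similar to known ncRNAs were present in the introns and the upstream and downstream intergenic regions of the 48 genes. Conclusion: In silico cloning can markedly extend the sequences of some esophageal cancer-related genes of unknown function. The chromosomal regions harboring some esophageal cancer-related genes contain fragments highly similar to ncRNAs, which suggests that ncRNAs may take part in the development of esophageal cancer; their specific functions remain to be studied in depth.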

 

Full Text Download:

 

 

7.A preliminary report on probe design for a typing diagnostic microarray for foot-and-mouth disease virus

 

 相 磊,陈小玲,李伍举, 梁之昶,章振华,王学文,胡国良,徐福洲,石 岗

 动物医学进展,2008,29:1-5

 

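Abstract: To establish a gene chip method for detecting different serotypes and genotypes of foot-and-mouth disease virus (FMDV), specific probes were designed for 8 genotypes of type O, 3 genotypes of type A, and type Asia 1. A total of 547 VP1 gene sequences of type O, type A, and type Asia 1 FMDV were downloaded from GenBank (USA) and from the gene database of the World Reference Laboratory for Foot-and-Mouth Disease (UK). The sequences of each serotype were multiply aligned with the ClustalW program of the DNAStar software, followed by phylogenetic analysis and genotyping. Genotype databases were built with the biological software BioSun 2.0, and specific probes were designed for each genotype. In total 104 candidate probes were designed, from which 12 specific probes were selected by chip experiments. Target sequence templates corresponding to the type-specific probes were serially diluted 10-fold and amplified by PCR, and the amplification products were hybridized with the probes to verify the sensitivity of each probe. The sensitivity of each probe for four type O genotypes (SEA, Euro-SA, ME-SA, and WA) was tested; these probes could detect positive targets at copy numbers on the order of 10².
Keywords: foot-and-mouth disease virus; serotype; genotype; VP1 gene; gene chip; oligonucleotide probe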

 

Full Text Download:

 

 

 

2007

1、Construction of mathematical model for high-level expression of foreign genes in pPIC9 vector and its verification

Bingli Wu, Lei Cha, Zepeng Du, Xiaomin Ying, Hua Li, Liyan Xu, Xiaofei Zheng, Enmin Li, Wuju Li

Biochemical and Biophysical Research Communications,2007, 354:498–504  

Abstract: In this report, we introduced a mathematical model for high-level expression of foreign genes in pPIC9 vector. First, we collected 40 heterologous genes expressed in pPIC9 vector, and these 40 genes were classified into a high-level expression group (expression level >100mg/L, 12 genes) and a low-level expression group (expression level <100mg/L, 28 genes). Then, the Naive Bayes method was used to construct the model with the RNA secondary structure profile of the 3'-end of foreign genes as features. The classification accuracy from leave-one-out cross-validation was 100%. Finally, another five genes collected from the literature were used to test the ability of the model, and four of them were correctly predicted. In addition, the model was also verified by expressing the human neutrophil gelatinase-associated lipocalin (NGAL) gene with an expression level of more than 100mg/L. Therefore, we propose that the model can be used to predict the expression level of heterologous genes before experiments and to optimize the experimental design to obtain high-level expression. Furthermore, we have developed a web server for evaluation and design for high-level expression of foreign genes, which is accessible at http://ppic9.med.stu.edu.cn/ppic9

Full Text Download:

 

2、Predicting siRNA efficiency

W. Li and L. Cha

Cell. Mol. Life Sci., 2007, 64:1785 – 1792

 

Abstract: Since the identification of RNA-mediated interference (RNAi) in 1998, RNAi has become an effective tool to inhibit gene expression. The inhibition mechanism is triggered by introducing a short interfering double-stranded RNA (siRNA, 19~27 bp) into the cytoplasm, where the guide strand of the siRNA (usually the antisense strand) binds to its target messenger RNA and the expression of the target gene is blocked. RNAi has been widely applied in gene functional analysis, and as a potential therapeutic strategy in viral diseases, drug target discovery, and cancer therapy. Among the factors which may compromise inhibition efficiency, how to design siRNAs with high efficiency and high specificity to their target genes is critical. Although many algorithms have been developed for this purpose, it is still difficult to design such siRNAs. In this review, we will briefly discuss prediction methods for siRNA efficiency and the problems of present approaches.

Full Text Download:

 

3、Prediction and analysis of novel microRNAs in the Arabidopsis thaliana genome

金伟波,孔栋,应晓敏,郭爱光,李伍举

生物物理学报,23(2007)389-396

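Abstract: MicroRNAs (miRNAs) are endogenous small RNAs of 21~25 nt found in animals and plants that play a key role in post-transcriptional gene regulation, but low-abundance and tissue-specific miRNAs are often hard to discover. To systematically identify novel non-homologous miRNAs in the Arabidopsis genome, we first screened 453 possible miRNA precursors genome-wide, based on features of reported Arabidopsis miRNAs. Second, to screen these precursors further, we built a support vector machine model, GenomicSVM, from human miRNA precursor data; its sensitivity and specificity on a human test set (30 human miRNA precursors and 1,000 negative miRNA precursors) were 86.3% and 98.1%, respectively, and its accuracy on an Arabidopsis test set (78 miRNA precursors) was 93.6%. Finally, GenomicSVM was applied to the 453 precursor sequences and yielded 37 novel candidate Arabidopsis miRNA precursors, which provides guidance for further experimental discovery of miRNAs.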

Full Text Download:  

 

2006

1、A comparative study of yeast ncRNAs and mRNAs based on k-tuple composition

李华、应晓敏、查磊、李伍举

生物物理学报,2006,22:110-116  

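Abstract: Like mRNAs, ncRNAs are important functional molecules. Using k-tuple (k-word) content as features, we classified and compared mature yeast ncRNA sequences with the coding regions, upstream sequences, and downstream sequences of mRNAs. The results show that, based on the 3-tuple content of mature ncRNA sequences and mRNA coding regions, the mean leave-one-out cross-validation (LOOCV) classification accuracy for ncRNAs versus mRNAs reached 93.93%; based on the 4-tuple and 5-tuple content of upstream sequences, the accuracies were 92.49% and 92.76%, respectively; based on the 4-tuple and 5-tuple content of downstream sequences, they were 91.58% and 90.60%, respectively; and using the 4-tuple and 5-tuple content of upstream and downstream sequences together, the mean accuracies were 94.68% and 94.83%, respectively. k-tuples that differ with statistical significance between the upstream and downstream sequences of ncRNAs and mRNAs were obtained by t-test. These results indicate that the 3-tuple content of mature ncRNA sequences and mRNA coding regions, and the 4- or 5-tuple content of the upstream and downstream sequences of ncRNAs and mRNAs, can effectively distinguish ncRNAs from mRNAs. The findings help not only to identify ncRNAs and mRNAs accurately but also to discover ncRNA-specific transcription factor binding sites.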
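The k-tuple content used as the feature vector above can be computed along the following lines. This is a minimal sketch rather than the authors' program: the function name, the DNA alphabet, the use of overlapping windows and the normalization to relative frequencies are assumptions.

```python
from collections import Counter
from itertools import product


def ktuple_content(seq, k, alphabet="ACGT"):
    """Return the relative frequency of every k-tuple over `alphabet`.

    Overlapping windows are counted; windows containing symbols outside
    the alphabet (e.g. N) are skipped.
    """
    seq = seq.upper()
    allowed = set(alphabet)
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    counts = Counter(w for w in windows if set(w) <= allowed)
    total = sum(counts.values())
    tuples = ("".join(p) for p in product(alphabet, repeat=k))
    return {t: (counts[t] / total if total else 0.0) for t in tuples}
```

For k = 3, 4 and 5 this gives vectors of 64, 256 and 1024 features per sequence, which would then be passed to a classifier and evaluated by LOOCV.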

 

Full Text Download:

 

2.BioSun2.0: a comprehensive software package for computer-aided design of molecular biology experiments

查磊, 应晓敏, 曹源, 李华, 李伍举

军事医学科学院院刊,2006,30:461-464

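Abstract: In 2004 we released BioSun 1.0, a software system for computer-aided design of molecular biology experiments that offers fairly comprehensive data processing and analysis functions. To better serve biomedical researchers, we have upgraded the system to version 2.0. The main new functions are: several forms of BLAST-based sequence alignment, ClustalW-based multiple sequence alignment and phylogenetic tree construction, display of protein three-dimensional structures, RNAfold-based RNA secondary structure prediction, and sequence format conversion. A comparison with the commercial comprehensive bioinformatics software systems DNASIS MAX 2.05, DNAStar 5.0, Vector NTI 9.1, and BioEdit 7.0 shows that BioSun2.0 is easy to operate, rich in functions, and cost-effective, and can meet the routine needs of biomedical laboratories.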

Full Text Download:

 

3.Mprobe 2.0:Computer-Aided Probe Design for Oligonucleotide Microarray  

 Wuju Li, Xiaomin Ying  

 Applied bioinformatics, 2006, 5:181-186

Abstract: DNA chips have proven to be effective tools in detecting gene expression levels. Compared with DNA chips using complementary DNA as probes, oligonucleotide microarrays using oligonucleotides as probes have attracted great attention because of their well known advantages. The design of gene-specific probes for each target is essential to the development of oligonucleotide microarrays. We have previously reported the development of a probe design software termed Mprobe 1.0. Here, we present a new version of this software, termed Mprobe 2.0. Several new features are included in Mprobe 2.0. Firstly, a paradox-based sequence database management system has been developed and integrated into the software, which consequently allows interoperability with sequences in GenBank, EMBL, and FASTA formats. Secondly, in contrast to setting a fixed threshold for the secondary structure of probes in Mprobe 1.0 and other related software, Mprobe 2.0 employs a different method. After parameters such as GC type, probe melting temperature and GC contents have been evaluated, candidate probes are sorted by the free energy from high to low value, followed by specificity analysis. Thirdly, Mprobe 2.0 provides users with substantial parameter options in the visual mode. Mprobe 2.0 possesses an easier interface for users to manage sequences annotated in different formats and design the optimal probes for oligonucleotide microarrays and other applications. AVAILABILITY: The program is free for non-commercial users and can be downloaded from the web page

Full Text Download:

 

2005

1.How many genes are needed for early detection of breast cancer, based on gene expression patterns in peripheral blood cells?

Wuju Li

Breast Cancer Research, 2005, vol. 7 (5): E5.  

Abstract: In their recent report [1], Sharma and coworkers explore the early detection of breast cancer. They analyzed a gene expression data set (1368 genes in 62 normal and 40 tumour samples, including sample duplication in different batches) using the nearest shrunken centroid method. They identified a panel of 37 genes that permitted early detection, with the classification accuracy being about 82%. This is a typical problem with sample classification based on gene expression profiling. The objective is to achieve high prediction accuracy with as few genes as possible, and so feature selection plays an important role; examination of a large number of genes will increase the dimensionality, computational complexity, and clinical cost. According to our previous study of data sets from patients with colon cancer, leukaemia and breast cancer [2], we estimated that five or six genes – rather than 37 – would be sufficient for the early detection of breast cancer [1]. So how many genes are indeed needed? In order to address this question, we evaluated the data presented by Sharma and coworkers using the Tclass system [2].

In the Tclass system, Fisher's linear discriminant analysis and a step-wise optimization procedure for feature selection are used to analyze a batch adjusted data set [1] in two ways. The first is to take the prediction accuracy from the training set as the object function. The second way is to take the classification accuracy from the leave-one-out cross-validation as the object function. For the former, the selected optimal feature sets are evaluated by randomly dividing all tissue samples into a training set (e.g. 50%, 67%, or 85% of samples) and a test set 200 times. The relationship between the prediction accuracy and the number of genes is illustrated in Fig. 1, which shows that the greatest prediction accuracy was achieved using six genes (Fig. 1a); other peaks in accuracy occurred when 10, 13, or 15 genes were used (Fig. 1b). Furthermore, two genes – the 481st (BC009696) and the 801st (BC000514) – permitted classification accuracy as high as 86%, which is greater than the 82% achieved by Sharma and coworkers [1] with the selected 37 genes.

 

Full Text Download:

 

 

2.An approach to studying lung cancer-related proteins in human blood

Ting Xiao, Wantao Ying, Lei Li, Zhi Hu, Ying Ma, Liyan Jiao, Jinfang Ma, Yun Cai, Dongmei Lin, Suping Guo, Naijun Han, Xuebing Di, Min Li, Dechao Zhang, Kai Su, Jinsong Yuan, Hongwei Zheng, Meixia Gao, Jie He, Susheng Shi, Wuju Li, Ningzhi Xu, Husheng Zhang, Yan Liu, Kaitai Zhang, Yanning Gao, Xiaohong Qian, and Shujun Cheng

Molecular & Cellular Proteomics, 2005, published online.

Abstract: Early-stage lung cancer detection is the first step towards successful clinical therapy and increased patient survival. Clinicians monitor cancer progression by profiling tumor cell proteins in the blood plasma of afflicted patients. Blood plasma, however, is a difficult medium for assessing cancer proteins, because it is rich in albumins and heterogeneous protein species. We report herein a method to detect the proteins released into the circulatory system by tumor cells. Initially, we analyzed the protein components in the conditional medium (CM) of lung cancer primary cell or organ cultures, and in the adjacent normal bronchus using 1-D PAGE and nano-ESI-MS/MS. We identified 299 proteins involved in key cellular processes such as cell growth, organogenesis and signal transduction. We selected 13 interesting proteins from this list, and analyzed them in 628 blood plasma samples using ELISA. We detected 11 of these 13 proteins in the plasma of lung cancer patients and non-patient controls. Our results showed that plasma MMP1 levels were elevated significantly in late-stage lung cancer patients, and that the plasma levels of 14-3-3 sigma, beta and eta in the lung cancer patients were significantly lower than those in the control subjects. To our knowledge, this is the first time that fascin, ezrin, CD98, annexin A4, 14-3-3 sigma, 14-3-3 beta and 14-3-3 eta proteins have been detected in human plasma by ELISA. The preliminary results showed that a combination of CD98, fascin, PIGR/SC and 14-3-3 eta had a higher sensitivity and specificity than any single marker. In conclusion, we report a method to detect proteins released into blood by lung cancer. This pilot approach may lead to the identification of novel protein markers in blood and provide a new method of identifying tumor biomarker profiles for guiding both early detection and therapy of human cancer.

Full Text Download:

 

2004

Wuju Li, Tao Liu, Xiaomin Ying, and Ming Fan

Molecular & Cellular Proteomics, 2004, vol.3 (10): S79.

Abstract: With genomic sequences from the three domains of life becoming increasingly available, the relationships between amino acid composition (AAC) and genome classes (organisms' phenotypes) have been widely studied in two respects. The first concentrates on differences in the AAC of particular types of proteins, or of whole proteomes, across genome classes. The second is the prediction of genome class from the AAC. Both aim to explain why certain organisms can live in extreme conditions of temperature, salinity, or pressure. Here we ask whether genome classes can be predicted as accurately as possible using small subsets of amino acids. To investigate this systematically, Fisher linear discriminant analysis (FLDA) was applied to four data sets: DOMAIN, LIFE, HTHAB, and ARCHAEA. DOMAIN covers the three domains of life (16 archaeal, 75 bacterial, and 6 eukaryotic genomes). LIFE covers three lifestyles (13 HTH, 4 TH, and 79 MES). HTHAB includes 10 HTH in archaea and 3 HTH in bacteria. ARCHAEA covers the three lifestyles in archaea (10 HTH, 3 TH, and 3 MES). Using feature selection over all possible combinations of features (amino acids), we found that the cross-validation accuracies for the four data sets could reach 94.8%, 97.9%, 100.0%, and 100.0% using the compositions of only four (A, I, K, and Q), five (I, K, P, V, and Y), two (E and Q), and two (M and Q) amino acids, respectively. The average cross-validation accuracy reaches 98.2%. Therefore, the AAC of proteomes provides an alternative way to determine genome classes such as lifestyle or domain of life.
To our knowledge, correspondence analysis, principal component analysis (PCA), and hierarchical cluster analysis have been applied to study the distinction between genome classes using the AAC, but classification methods have not. Our work therefore represents a first attempt in this direction.
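To give a sense of the search space behind "all possible combinations of features", the sketch below computes a proteome's amino acid composition and enumerates candidate amino acid subsets. It is illustrative only: the function names are hypothetical, the discriminant analysis and cross-validation steps are omitted, and pooling counts over all proteins is an assumption.

```python
from collections import Counter
from itertools import combinations
from math import comb

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(proteins):
    """Return the fraction of each standard amino acid over all proteins."""
    counts = Counter()
    for protein in proteins:
        counts.update(aa for aa in protein.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def feature_subsets(size):
    """Yield every subset of `size` amino acids as a tuple."""
    return combinations(AMINO_ACIDS, size)


# Number of candidate subsets for the subset sizes reported in the abstract.
for size in (2, 4, 5):
    print(size, comb(len(AMINO_ACIDS), size))
```

With only 20 features, exhaustive enumeration stays small (190, 4845 and 15504 subsets for sizes 2, 4 and 5), so each subset can be scored by cross-validation without a heuristic search.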

PDF Abstract Download:

   

Xiaomin Ying, Hong Luo, Jingchu Luo and Wuju Li

Nucleic Acids Research, 2004, vol.32: W150-W153.

Abstract: Prediction of RNA secondary structure is important in the functional analysis of RNA molecules. The RDfolder web server described in this paper provides two methods for prediction of RNA secondary structure: random stacking of helical regions and helical regions distribution. The random stacking method predicts secondary structure by Monte Carlo simulations. The method of helical regions distribution predicts secondary structure based on the helices that appear most frequently in the set of structures, which are generated by the random stacking method. The RDfolder web server can be accessed at http://rna.cbi.pku.edu.cn.

Full Text Download:

 

 

3、BioSun: a software system for computer-aided design of molecular biology experiments

李伍举, 应晓敏

军事医学科学院院刊2004 vol. 28(5): 401-404

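Abstract: This paper describes BioSun, a bioinformatics software system for computer-aided design of molecular biology experiments that we researched and developed ourselves; it runs under Windows. Its main functions are: visual sequence editing; a database management system that accepts multiple sequence formats (EMBL, GenBank, and FASTA); several modes of sequence comparison; several modes of epitope prediction; RNA secondary structure prediction based on several algorithms; restriction site analysis and restriction map construction; computer-aided PCR experiment design; computer-aided probe design for oligonucleotide microarrays; computer-aided primer design for cDNA microarrays; and design of high-level expression of foreign genes in prokaryotic systems. BioSun uses a graphical user interface, supports flexible management of graphics and text files, and is flexible to operate and rich in functions. It can be used for computer-aided design of molecular biology experiments and is of considerable value in speeding up experiments and raising their success rate.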

 

 

2003

Wuju Li, Ming Fan and Momiao Xiong

Bioinformatics, 2003, vol.19: 811-817

Motivation: Feature (gene) selection can dramatically improve the accuracy of gene expression profile based sample class prediction. Many statistical methods for feature (gene) selection such as stepwise optimization and Monte Carlo simulation have been developed for tissue sample classification. In contrast to class prediction, few statistical and computational methods for feature selection have been applied to clustering algorithms for pattern discovery.
Results: An integrated scheme and corresponding program SamCluster for automatic discovery of sample classes based on gene expression profile is presented in this report. The scheme incorporates the feature selection algorithms based on the calculation of CV (coefficient of variation) and t-test into hierarchical clustering and proceeds as follows. At first, the genes with their CV greater than the pre-specified threshold are selected for cluster analysis, which results in two putative sample classes. Then, significantly differentially expressed genes in the two putative sample classes with p-values 0.01, 0.05, or 0.1 from t-test are selected for further cluster analysis. The above processes were iterated until the two stable sample classes were found. Finally, the consensus sample classes are constructed from the putative classes that are derived from the different CV thresholds, and the best putative sample classes that have the minimum distance between the consensus classes and the putative classes are identified. To evaluate the performance of the feature selection for cluster analysis, the proposed scheme was applied to four expression datasets COLON, LEUKEMIA72, LEUKEMIA38, and OVARIAN. The results show that there are only 5, 1, 0, and 0 samples that have been misclassified, respectively. We conclude that the proposed scheme, SamCluster, is an efficient method for discovery of sample classes using gene expression profile.
Availability: The related program SamCluster is available upon request or from the web page http://www.sph.uth.tmc.edu:8052/hgc/Downloads.asp
or http://www.biosun.com.cn/softwares/samcluater.html
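The first step of the scheme, keeping genes whose coefficient of variation exceeds a threshold, might look like the following minimal sketch. It is not the SamCluster code: the function names, the use of the sample standard deviation and the toy expression values are assumptions.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the absolute mean."""
    m = mean(values)
    if m == 0:
        return float("inf")  # CV is undefined for a zero mean; flagged as inf here
    return stdev(values) / abs(m)


def select_genes_by_cv(expression, threshold):
    """Return the genes whose CV across samples is greater than `threshold`.

    `expression` maps a gene name to its expression values across samples.
    """
    return [gene for gene, values in expression.items()
            if coefficient_of_variation(values) > threshold]


# Toy data (hypothetical): g1 is flat across samples, g2 varies strongly.
toy = {"g1": [10, 10, 10, 10], "g2": [1, 5, 9, 13]}
print(select_genes_by_cv(toy, threshold=0.5))
```

The selected genes would then feed the hierarchical clustering that produces the two putative sample classes, after which the t-test filter is applied.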

 

Full Text Download:

 

 

2.Prediction of epitopes of the SARS virus

李伍举. 刘涛.

解放军医学杂志 2003 vol.28(6):S9-S10

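Abstract: [Objective] To predict epitopes of S and M, two membrane proteins of the SARS virus, using an integrated epitope prediction method that combines Hopp & Woods hydrophilicity, Janin surface accessibility, Karplus-Schulz backbone flexibility, and charge distribution, together with protein secondary structure prediction, so as to provide a basis for SARS vaccine design. [Results] Analysis of the epitopes of the two SARS membrane proteins S and M with Goldkey and other software yielded 14 and 7 possible epitopes, respectively.

Note: The relevant functions of Goldkey have been integrated into BioSun, our most recently released software.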

Full Text Download:    

 

3.Phylogenetic analysis of a novel coronavirus, the probable pathogen of infectious atypical pneumonia

刘涛. 李伍举. 范明.

解放军医学杂志 2003 vol.28(6):S1-S5

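Abstract: Since March 2003, a novel coronavirus (SARS-CoV) has been tentatively identified as the pathogen of Severe Acute Respiratory Syndrome (SARS), a lethal infectious disease that broke out at the end of 2002. The virus has the typical genome organization of other known coronaviruses. Phylogenetic analysis of the virus can guide further experimental research. We first built a phylogenetic tree of SARS-CoV at the whole-genome level to establish its evolutionary position, and then analyzed, at both the nucleic acid and the protein level, the phylogenetic trees of its 5 main homologous proteins: replicase, S protein, E protein, M protein, and N protein. The results show that SARS-CoV is homologous to currently known coronaviruses but has features clearly different from them: the evolutionary histories of its homologous genes differ from one another, and the evolutionary history of the structural protein genes differs from that of the genome. SARS-CoV is relatively closely related to IBV and TGV, especially IBV; the particularly close relationship at the level of the E and M proteins deserves attention and reference in further experimental research.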

Full Text Download:  

 

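4.High-level expression, purification, and identification of the M3-M4 loop gene fragment of the main subunit of the human NMDA receptor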

张玉梅. 孙长凯. 范明. 李伍举. 刘淑红. 赵杰. 韩大跃. 王嘉玺.

中国生物化学与分子生物学报 2003 vol.19(5):588-593

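Abstract: The target fragment of the M3-M4 loop of the main subunit of the human N-methyl-D-aspartate (NMDA) receptor was obtained by genetic engineering for use as an immunogen in further studies of immunogenicity and related applications. Total RNA was extracted from human glioma tissue, and the gene fragment of the M3-M4 loop of the main subunit of the human NMDA receptor was amplified by RT-PCR and then optimized and modified according to the computer-aided prediction method based on the mathematical model of high-level expression of foreign genes in the prokaryotic expression vector pBV220. The target gene was cloned into pBV220 and transformed into Escherichia coli DH5α, and expression was induced by raising the temperature. Expression of the recombinant in E. coli was examined at the protein level; the product was purified by preparative SDS-PAGE and identified by relative molecular mass, immunoreactivity, and peptide mass fingerprinting. The results show that a prokaryotic expression vector for the M3-M4 loop of the main subunit of the human NMDA receptor (named pBV-NR1L3) was successfully constructed and that high-level expression was achieved through gene optimization. Gel scanning showed that the expressed product accounted for about 29% of total bacterial protein, and the purity of the recombinant peptide exceeded 95%.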

 

 

2002

Li Wuju and Xiong Momiao

Bioinformatics 2002, vol.18: 325-326

Summary: A method that incorporates feature selection into Fisher’s linear discriminant analysis for gene expression based tumor classification and a corresponding program Tclass were developed. The proposed method was applied to a public gene expression data set for colon cancer that consists of 22 normal and 40 tumor colon tissue samples to evaluate its performance for classification. Preliminary results demonstrated that using only a subset of genes ranging from 3 to 10 can achieve high classification accuracy.
Availability: The program is written in Matlab and is being rewritten in the Java language. The source code is available upon request.

Full Text Download:

 

 

Wuju Li, Jian Huang, Ming Fan, Shengqi Wang

Applied Bioinformatics 2002:1(3):163-166.

Abstract: The present work describes a complete probe design software system for oligonucleotide microarrays based on Kane’s research on probe sensitivity and specificity (Kane’s rule). Combining Kane’s rule and traditional criteria for probe design we constructed MProbe, the software system for oligonucleotide microarrays using Java. The general criteria for probe design are: (1) probes may have different lengths that range from 20 to 100 bases; (2) they should have a similar melting temperature (Tm) or GC content; (3) they should not contain stable secondary structures; and (4) they abide by Kane’s rule.
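Criterion (2), a similar GC content across probes, can be illustrated with a simple window scan. This sketch is not MProbe's code: the function names and the GC bounds are hypothetical, and melting temperature, secondary structure and Kane's rule are deliberately left out.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_probes(target, length, gc_min, gc_max):
    """Return (start, probe) for every window whose GC content is in range.

    `start` is a 0-based offset into `target`; all overlapping windows of
    the given length are examined.
    """
    hits = []
    for start in range(len(target) - length + 1):
        probe = target[start:start + length]
        if gc_min <= gc_content(probe) <= gc_max:
            hits.append((start, probe))
    return hits
```

In practice `length` would fall in the 20 to 100 base range given in criterion (1), and the surviving candidates would still need the secondary structure and specificity checks of criteria (3) and (4).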

 

 

3.Bioinformatics of gene expression profiles

李伍举

军事医学科学院院刊,2002 vol.26(1):73-76.

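Abstract: DNA microarray technology is another major biotechnology following recombinant DNA technology and PCR amplification. Microarray experiments make it possible to observe simultaneously the dynamic expression levels of thousands of genes in a given biological phenomenon. Compared with the earlier research paradigm of studying the expression of single genes, this greatly changes the outlook of molecular biologists, enabling the study of life phenomena and their nature from a systematic, global perspective at the genome level. Microarray technology has been applied to tumor typing, tumor classification, gene function studies, construction of gene regulatory networks, drug target identification, and many other areas. In essence, however, what a microarray experiment yields directly is a gene expression profile (that is, a gene expression matrix in which rows represent genes and columns represent experimental samples), and practical applications of microarrays are realized through bioinformatics processing of this matrix. Bioinformatics is therefore an extremely important link in microarray-based molecular biology research. This article reviews the bioinformatics methods related to gene expression profiles.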

 

 

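4.Analysis of the physicochemical properties and antigenicity of receptor activation-related polypeptides of the main subunit of the human N-methyl-D-aspartate receptor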

孙长凯. 赵杰. 李伍举. 冯健男. 刘淑红.等

中华医学杂志 2002 vol.82(1):50-53

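Abstract Objective: To analyze the antigenicity and physicochemical properties of P1 and P2, two receptor activation-related polypeptides on NR1a, the main subunit of the human N-methyl-D-aspartate receptor (NMDAR). Methods: The amino acid sequence of human NR1a was retrieved from a protein database with the GOLDKEY software. Polypeptide fragment P1 (151 amino acids) was taken upstream of the first transmembrane domain and fragment P2 (144 amino acids) downstream of the third transmembrane domain. Multiparameter analysis was performed with parameters including Hopp & Woods and Kyte hydrophilicity, Janin surface accessibility, Karplus-Schulz backbone flexibility, and Welling antigenicity; the Prosite program and the Chou-Fasman method were used to compare their amino acid sites and secondary structure features. On this basis the antigenic sites of P1 and P2 were determined comprehensively and compared with existing experimental results. Results: P1 and P2 may contain 6 and 7 sequences of 8~15 aa, respectively, with good antigenicity. The relevant sequences of P1 lie mainly at its amino terminus, relatively far from the key amino acid residues for ligand binding. Those of P2 are distributed more evenly and either contain sites important for receptor activation or lie close to the key ligand-binding residues. The overall antigenicity, hydrophilicity, and accessibility of P2 are stronger than those of P1, most notably in its 15 membrane-proximal residues. Both fragments contain a certain number of β-turns, but P1 contains more cysteine residues and random coil, whereas P2 contains more aromatic residues and is mainly α-helical. Conclusion: P1 and P2, the two receptor activation-related polypeptides on the human NMDAR main subunit NR1a, both have a certain number of antigenic sites; compared with P1, P2 may more readily serve as a molecular target for immune intervention against NMDAR.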

 

 

2001

1.Feature (gene) selection in gene expression-based tumor classification

Xiong M, Li W, Zhao J, Jin L, Boerwinkle E.

Mol Genet Metab. 2001 vol.73(3):239-47.

Abstract: There is increasing interest in changing the emphasis of tumor classification from morphologic to molecular. Gene expression profiles may offer more information than morphology and provide an alternative to morphology-based tumor classification systems. Gene selection involves a search for gene subsets that are able to discriminate tumor tissue from normal tissue, and may have either clear biological interpretation or some implication in the molecular mechanism of the tumorigenesis. Gene selection is a fundamental issue in gene expression-based tumor classification. In the formation of a discriminant rule, the number of genes is large relative to the number of tissue samples. Too many genes can harm the performance of the tumor classification system and increase the cost as well. In this report, we discuss criteria and illustrate techniques for reducing the number of genes and selecting an optimal (or near optimal) subset of genes from an initial set of genes for tumor classification. The practical advantages of gene selection over other methods of reducing the dimensionality (e.g., principal components), include its simplicity, future cost savings, and higher likelihood of being adopted in a clinical setting. We analyze the expression profiles of 2000 genes in 22 normal and 40 colon tumor tissues, 5776 sequences in 14 human mammary epithelial cells and 13 breast tumors, and 6817 genes in 47 acute lymphoblastic leukemia and 25 acute myeloid leukemia samples. Through these three examples, we show that using 2 or 3 genes can achieve more than 90% accuracy of classification. This result implies that after initial investigation of tumor classification using microarrays, a small number of selected genes may be used as biomarkers for tumor classification, or may have some relevance in tumor development and serve as a potential drug target. 
In this report we also show that stepwise Fisher's linear discriminant function is a practicable method for gene expression-based tumor classification.

 

 

2000

1.Computational methods for gene expression-based tumor classification

Xiong M, Jin L, Li W, Boerwinkle E.

Biotechniques. 2000 vol.29(6):1264-8,1270.

Abstract: Gene expression profiles may offer more or additional information than classic morphologic- and histologic-based tumor classification systems. Because the number of tissue samples examined is usually much smaller than the number of genes examined, efficient data reduction and analysis methods are critical. In this report, we propose a principal component and discriminant analysis method of tumor classification using gene expression profile data. Expression of 2000 genes in 40 tumor and 22 normal colon tissue samples is used to examine the feasibility of gene expression-based tumor classification systems. Using this method, the percentage of correctly classified normal and tumor tissue was 87.0%. The combined approach using principal components and discriminant analysis provided superior sensitivity and specificity compared to an approach using simple differences in the expression levels of individual genes.

 

 

1998

Li Wu Ju, Lei Hong Xing, Pei Wu Hong and Wu Jia Jin

Bioinformatics 1998, vol.14: 884-885.

RESULTS: Based on the mathematical model of high-level expression of heterologous genes in prokaryotic vector pBV220, we developed a program GeneDn for high-level expression design of natural and synthetic genes. AVAILABILITY: The program is written in Turbo Pascal 7.0. The source code and related material are available upon request.

Full Text Download:

 

 

Li Wuju and Wu Jiajin

Bioinformatics 1998, vol.14: 700-706.

MOTIVATION: RNAs play an important role in many biological processes and knowing their structure is important in understanding their function. Due to difficulties in the experimental determination of RNA secondary structure, the methods of theoretical prediction for known sequences are often used. Although many different algorithms for such predictions have been developed, this problem has not yet been solved. It is thus necessary to develop new methods for predicting RNA secondary structure. The most-used at present is Zuker's algorithm which can be used to determine the minimum free energy secondary structure. However many RNA secondary structures verified by experiments are not consistent with the minimum free energy secondary structures. In order to solve this problem, a method used to search a group of secondary structures whose free energy is close to the global minimum free energy was developed by Zuker in 1989. When considering a group of secondary structures, if there is no experimental data, we cannot tell which one is better than the others. This case also occurs in combinatorial and heuristic methods. These two kinds of methods have several weaknesses. Here we show how the central limit theorem can be used to solve these problems.

RESULTS: An algorithm for predicting RNA secondary structure based on helical regions distribution is presented, which can be used to find the most probable secondary structure for a given RNA sequence. It consists of three steps. First, list all possible helical regions. Second, according to central limit theorem, estimate the occurrence probability of every helical region based on the Monte Carlo simulation. Third, add the helical region with the biggest probability to the current structure and eliminate the helical regions incompatible with the current structure. The above processes can be repeated until no more helical regions can be added. Take the current structure as the final RNA secondary structure. In order to demonstrate the confidence of the program, a test on three RNA sequences: tRNAPhe, Pre-tRNATyr, and Tetrahymena ribosomal RNA intervening sequence, is performed.
AVAILABILITY:
The program is written in Turbo Pascal 7.0. The source code is available upon request.
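The third step described under RESULTS, repeatedly adding the most probable helical region and discarding incompatible ones, can be sketched as a greedy loop. This is an illustration rather than the authors' Turbo Pascal program. Several things are assumed: a helical region is written as a hypothetical tuple (i, j, n) meaning base pairs (i+k, j-k) for k = 0..n-1; two regions are treated as incompatible if they share a base or cross (pseudoknots are excluded); and the probabilities are fixed inputs standing in for the Monte Carlo estimates, without re-estimation between rounds.

```python
def base_pairs(helix):
    """Expand a helical region (i, j, n) into its list of base pairs."""
    i, j, n = helix
    return [(i + k, j - k) for k in range(n)]


def compatible(h1, h2):
    """Return True if two helical regions share no base and do not cross."""
    p1, p2 = base_pairs(h1), base_pairs(h2)
    used = {base for pair in p1 for base in pair}
    if any(base in used for pair in p2 for base in pair):
        return False
    for a, b in p1:
        for c, d in p2:
            if a < c < b < d or c < a < d < b:  # crossing pairs (pseudoknot)
                return False
    return True


def greedy_structure(helix_probs):
    """Greedily assemble a structure from {helix: probability}."""
    remaining = dict(helix_probs)
    structure = []
    while remaining:
        best = max(remaining, key=remaining.get)  # most probable helix left
        structure.append(best)
        del remaining[best]
        # Drop every helix that conflicts with the one just added.
        remaining = {h: p for h, p in remaining.items() if compatible(best, h)}
    return structure
```

Because each surviving region is compatible with every region added so far, the loop ends with a set of mutually compatible helices, which is read off as the predicted secondary structure.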

Full Text Download:

 

 

1997

1.Quantitative analysis of expression levels of foreign genes in the pBV220 vector

李伍举,吴加金

病毒学报,1997,vol.13: 126-133.

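Abstract: Using sequence analysis techniques such as RNA secondary structure prediction based on random stacking of helical regions and calculation of codon bias, we analyzed the expression levels of 22 foreign genes carried in the pBV220 vector, including human interleukin-2 and human interleukin-4. The results show that the secondary structure free energies of the region from -30 to 39 at the 5' end and the region from 30 to -39 at the 3' end are statistically significantly related to expression level, followed by the local codon bias of the 9 bp at the 3' end; the number of bases between the SD sequence and the start codon ATG, within the range 8±3, has no significant relationship with expression level. In addition, a discriminant function was constructed by discriminant analysis, with a discriminant agreement rate as high as 95.5%.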