We conducted a multi-year project to collect discharge summaries from multiple hospitals, compiled them into a large text database to build a common document vector space, and developed various applications on top of it. We extracted 243,907 discharge summaries from seven hospitals. Term structure and the number of terms differed between the hospitals; however, the differences by disease were similar. We built the vector space using the TF-IDF method. We performed a cross-match analysis of DPC selection among the seven hospitals. About 80% of cases were correctly matched. Using model data from other hospitals reduced the selection rate to around 10%; however, integrated model data from all hospitals restored the selection rate.
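
The cross-match analysis described above can be illustrated with a minimal sketch. It is not the study's implementation: the tokenized records, the labels `cardiac` and `respiratory` (stand-ins for DPC codes), the nearest-centroid selection rule, and all function names are assumptions introduced here for illustration. The sketch builds a TF-IDF model from one hospital's summaries, then measures how often that model selects the correct label for its own summaries, for another hospital's summaries, and when the model is trained on the integrated data.

```python
import math
from collections import Counter, defaultdict


def build_idf(docs):
    """Inverse document frequency over a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalized sparse TF-IDF vector; terms unknown to the model are dropped."""
    tf = Counter(t for t in doc if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def train(records):
    """Model data: an IDF table plus one normalized centroid per label."""
    idf = build_idf([doc for doc, _ in records])
    sums = defaultdict(Counter)
    for doc, label in records:
        for t, w in tfidf(doc, idf).items():
            sums[label][t] += w
    centroids = {}
    for label, vec in sums.items():
        norm = math.sqrt(sum(w * w for w in vec.values()))
        centroids[label] = {t: w / norm for t, w in vec.items()}
    return idf, centroids


def select(doc, model):
    """Select the label whose centroid is most similar to the document."""
    idf, centroids = model
    vec = tfidf(doc, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def match_rate(model, records):
    """Fraction of records for which the selected label equals the true label."""
    hits = sum(select(doc, model) == label for doc, label in records)
    return hits / len(records)


# Hypothetical toy data: (tokens, label) pairs standing in for discharge summaries.
hospital_1 = [
    (["chest", "pain", "stent", "pci"], "cardiac"),
    (["chest", "pain", "angina", "pci"], "cardiac"),
    (["cough", "fever", "pneumonia", "antibiotic"], "respiratory"),
    (["fever", "sputum", "pneumonia", "antibiotic"], "respiratory"),
]
hospital_2 = [
    (["angina", "catheter", "stent"], "cardiac"),
    (["dyspnea", "pneumonia", "oxygen"], "respiratory"),
]

own_model = train(hospital_1)
integrated_model = train(hospital_1 + hospital_2)

own_rate = match_rate(own_model, hospital_1)           # model tested on its own hospital
cross_rate = match_rate(own_model, hospital_2)         # model tested on another hospital
integrated_rate = match_rate(integrated_model, hospital_2)  # integrated model data
```

In this toy data the two hospitals share enough vocabulary that the cross-hospital model still selects the right label. With real summaries, terms that a hospital's model has never seen are dropped by `tfidf`, which is one plausible way a model built from another hospital's data could lose accuracy and an integrated model could recover it.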