Volume 3, Issue 4
Some properties of matrix product and its applications in nonnegative tensor decomposition

J. Info. Comput. Sci., 3 (2008), pp. 243-257.

  • Abstract
In DNA-related research, mutations, defined as heritable changes in the DNA sequence, occur frequently under varying environmental conditions. Approximate string matching is therefore used to answer queries that search for mutations. Given a user-specified parameter k, the problem is to find where substrings with at most k errors relative to the query sequence occur in the database sequences. In this paper, we use a new index structure to support the proposed method for approximate string matching. In the proposed index structure, EII, we map each overlapping q-gram of the database sequence to an index key and record the positions at which the q-gram occurs in the corresponding index entry. In the proposed method, EOB, we first generate all possible mutations for each gram in the query sequence. Then, using the information recorded in the EII structure, we check both the local order (i.e., the order of characters in a gram) and the global order (i.e., the order of grams in an interval) of these mutations. The final answers can be determined directly, without the dynamic programming used in traditional filter methods for approximate string matching. Our experimental results show that, for short query sequences, our method can outperform the (k + s) q-samples filter, a well-known method for approximate string matching, in processing time under various conditions.
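The indexing idea in the abstract (map each overlapping q-gram to a key and record where it occurs) can be illustrated with a minimal Python sketch. This is not the paper's EII structure or EOB method: the function names, the plain-string keys, the `ACGT` alphabet, and the restriction of "mutations" to single-base substitutions are all assumptions made for illustration only.

```python
from collections import defaultdict

ALPHABET = "ACGT"  # assumed DNA alphabet; the abstract does not specify one


def build_qgram_index(sequence, q):
    """Map every overlapping q-gram of `sequence` to its 0-based start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - q + 1):
        index[sequence[start:start + q]].append(start)
    return dict(index)


def single_substitutions(gram):
    """All grams differing from `gram` in exactly one position (substitutions only)."""
    variants = set()
    for i, original in enumerate(gram):
        for base in ALPHABET:
            if base != original:
                variants.add(gram[:i] + base + gram[i + 1:])
    return variants


def candidate_positions(index, gram):
    """Positions where `gram` or one of its single-substitution variants occurs."""
    hits = set(index.get(gram, []))
    for variant in single_substitutions(gram):
        hits.update(index.get(variant, []))
    return sorted(hits)


index = build_qgram_index("ACGTACGA", 3)
print(index["ACG"])                       # start positions of the exact gram
print(candidate_positions(index, "CGT"))  # exact hits plus one-substitution hits
```

A real filter would also handle insertions and deletions and would verify the order of grams across the query, as the abstract describes; the sketch covers only the lookup step.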
  • Copyright

COPYRIGHT: © Global Science Press

  • BibTex
  • RIS
  • TXT
@Article{JICS-3-243,
  author = {},
  title = {Some properties of matrix product and its applications in nonnegative tensor decomposition},
  journal = {Journal of Information and Computing Science},
  year = {2024},
  volume = {3},
  number = {4},
  pages = {243--257},
  issn = {1746-7659},
  doi = {https://doi.org/},
  url = {http://global-sci.org/intro/article_detail/jics/22764.html}
}
TY  - JOUR
T1  - Some properties of matrix product and its applications in nonnegative tensor decomposition
AU  - 
JO  - Journal of Information and Computing Science
VL  - 3
IS  - 4
SP  - 243
EP  - 257
PY  - 2024
DA  - 2024/01
SN  - 1746-7659
DO  - http://doi.org/
UR  - https://global-sci.org/intro/article_detail/jics/22764.html
KW  - databases
KW  - approximate string matching
KW  - DNA
KW  - mutation
KW  - similarity search
ER  - 
. (2024). Some properties of matrix product and its applications in nonnegative tensor decomposition. Journal of Information and Computing Science. 3 (4). 243-257. doi: