Scheduling Nested Transactions on In-Memory Data Grids
Published in Kuan-Ching Li, Hai Jiang, Albert Y. Zomaya, Big Data Management and Processing, 2017
Junwhan Kim, Roberto Palmieri, Binoy Ravindran
We consider Herlihy and Sun's dataflow distributed STM model [6], in which transactions are immobile and objects move from node to node. In this model, each node has a TM proxy that provides interfaces to its local application and to the proxies at other nodes. When a transaction Ti at node ni requests object oj, the TM proxy of ni first checks whether oj is in its local cache. If the object is not present, the proxy invokes a distributed cache-coherence (cc) protocol to fetch oj from the network. When node nk, which holds oj, receives the request from ni, it checks whether the object is in use by a local transaction Tk. If so, nk's proxy invokes a contention manager to mediate the conflict between Ti and Tk over oj.
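To make the request path concrete, here is a minimal Java sketch of the two proxy roles described above. All names (TMProxy, CacheCoherence, ContentionManager, open, onRemoteRequest) are illustrative assumptions, not the authors' API; it only restates the control flow of the dataflow model.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Assumed interfaces: the cc protocol locates and moves an object here,
// and the contention manager decides which conflicting transaction proceeds.
interface CacheCoherence { Object fetch(String objectId); }
interface ContentionManager { void resolve(Transaction requester, Transaction holder, String objectId); }
class Transaction { final long id; Transaction(long id) { this.id = id; } }

class TMProxy {
    private final Map<String, Object> localCache = new ConcurrentHashMap<>();
    private final Map<String, Transaction> inUse = new ConcurrentHashMap<>();
    private final CacheCoherence cc;
    private final ContentionManager cm;

    TMProxy(CacheCoherence cc, ContentionManager cm) { this.cc = cc; this.cm = cm; }

    /** Local transaction ti opens object oj: use the cached copy or fetch it remotely. */
    Object open(Transaction ti, String oj) {
        Object obj = localCache.get(oj);
        if (obj == null) {
            obj = cc.fetch(oj);        // distributed cache-coherence lookup in the network
            localCache.put(oj, obj);
        }
        inUse.put(oj, ti);
        return obj;
    }

    /** A remote proxy requests oj; mediate first if a local transaction holds it. */
    Object onRemoteRequest(Transaction ti, String oj) {
        Transaction tk = inUse.get(oj);
        if (tk != null) {
            cm.resolve(ti, tk, oj);    // conflict between Ti and Tk over oj
        }
        inUse.remove(oj);
        return localCache.remove(oj);  // in the dataflow model, the object migrates to the requester
    }
}
```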
Content-Centric-Based Networking Systems
Published in M. Bala Krishna, User-Centric and Information-Centric Networking and Services, 2019
CCN [34] enables routers to periodically retransmit dropped and pending data packets. The lifetime of an interest entry in the pending table is extended at entry time, keeping the upstream alive until the requested data is received; pending interests are periodically retransmitted to handle consumer timeouts. Content objects [35] are matched against identical data contents (frequently repeated words, phrases, songs, or pictures) distributed among the neighboring servers, so that composite information can be derived using limited resources. Content security [36] is achieved by using the name sequence with an implicit digest and public signatures. Forwarding NACKs verify the status of the next forwarding node and reduce the drawbacks of flooding. The growing role of content sources in Internet systems requires autonomous content and cache systems [37] that coordinate to improve peer business relations. Global performance in content-centric networks is achieved through multi-dimensional distributed cache systems based on (i) cache decision policies that coordinate resources located at different places, (ii) cache allocation that matches user requests and reduces pending interests, and (iii) global stabilization of cache configurations. In content peering, autonomous Internet SPs exchange cache summaries and locally available messages, and share the most recent interest copies. Cache savings and cost minimization depend on the accuracy of request arrival times. Cache synchronization relies on timely updates and achieves uniform allocation of cache resources.
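The pending-table behavior (lifetime extension plus periodic retransmission of unsatisfied interests) can be sketched as follows. This is a schematic illustration under assumed names and timer values (PendingInterestTable, Forwarder, LIFETIME_MS, RETX_PERIOD_MS), not a CCN implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Assumed upstream interface: retransmits an interest by content name.
interface Forwarder { void sendInterest(String name); }

class PendingInterestTable {
    private static final long LIFETIME_MS = 4_000;     // assumed interest lifetime
    private static final long RETX_PERIOD_MS = 1_000;  // assumed retransmission period

    private final Map<String, Long> pending = new ConcurrentHashMap<>(); // name -> expiry time
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    PendingInterestTable(Forwarder forwarder) {
        // Periodically retransmit interests that are still pending; expire timed-out entries.
        timer.scheduleAtFixedRate(() -> {
            long now = System.currentTimeMillis();
            pending.forEach((name, expiry) -> {
                if (now < expiry) forwarder.sendInterest(name); // keep the upstream alive
                else pending.remove(name);                      // consumer timeout
            });
        }, RETX_PERIOD_MS, RETX_PERIOD_MS, TimeUnit.MILLISECONDS);
    }

    /** Record a new interest, or refresh an existing entry to extend its lifetime. */
    void addOrRefresh(String name) {
        pending.put(name, System.currentTimeMillis() + LIFETIME_MS);
    }

    /** The requested data arrived: the interest is satisfied, so drop the entry. */
    void satisfy(String name) {
        pending.remove(name);
    }
}
```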
Big data analytics for retail industry using MapReduce-Apriori framework
Published in Journal of Management Analytics, 2020
Neha Verma, Dheeraj Malhotra, Jatinder Singh
The proposed MR-Apriori algorithm is an enhanced version of the popular Apriori algorithm built on the MapReduce platform. When a job is initialized at the master node (NameNode), the master synchronizes the mapper tasks and the reducer tasks (Belbachir & Belbachir, 2012). Each map task reads one block from the Hadoop distributed file system and generates the <key, value> pairs for that block. Each reduce task then generates sorted <key, value-list> pairs for the keys it is responsible for, and the final output is written back to disk. It is often advisable to use a combiner function, which merges similar local <key, value> pairs to minimize communication between the map and reduce tasks. Finally, the master node receives the output of each MapReduce phase in a file and synchronizes the DataNodes for the subsequent phases. The Distributed Cache in HDFS is used to cache files that all DataNodes read during execution. Moreover, the output of each phase is written to disk in the Edit Logs to make the system fault tolerant. The proposed system design is shown in Figure 3 and is discussed in detail as follows.
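For concreteness, the following Hadoop sketch shows what one such counting pass (the first MR-Apriori phase over single items) could look like, including a combiner. The class names and the input format (one comma-separated transaction per line) are assumptions for illustration; the full MR-Apriori pipeline has further candidate-generation phases not shown here.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class ItemCountPass {

    public static class ItemMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text item = new Text();

        @Override
        protected void map(LongWritable offset, Text transaction, Context ctx)
                throws IOException, InterruptedException {
            // One input line = one retail transaction, items comma-separated (assumed format).
            for (String it : transaction.toString().split(",")) {
                item.set(it.trim());
                ctx.write(item, ONE);                 // emit <key, value> pair per item
            }
        }
    }

    /** Used both as the combiner (local pre-merge) and the reducer (global count). */
    public static class CountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text item, Iterable<IntWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) sum += c.get();
            ctx.write(item, new IntWritable(sum));    // sorted <key, value-list> collapsed to a count
        }
    }
}
```

In the job driver, the same reducer class can be registered as the combiner via job.setCombinerClass(CountReducer.class), which performs the local merging of <key, value> pairs described above before the shuffle.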
A Hadoop-based parallel mining of frequent itemsets using N-Lists
Published in Journal of the Chinese Institute of Engineers, 2018
Mohammad Karim Sohrabi, Narjes Taheri
The N-Lists of frequent items are saved in a distributed cache so that they can be shared by all the processors. The set of all N-Lists of 2-itemsets can also be partitioned by their first item. As in the second phase, the generation of one or more partitions is assigned to each available processor. For example, if three processors generate the N-Lists of 2-itemsets of the data-set of Figure 1, which are shown in Figure 4, a load-balanced assignment of the N-List generation process can be as follows: processor 1 is responsible for generating the N-Lists of 2-itemsets that start with d, processor 2 for those that start with f, and processor 3 for those that start with a, c, b, or e.
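One simple way to realize such a load-balanced split is a greedy assignment: sort the first-item partitions by estimated size and repeatedly hand the largest remaining one to the least-loaded processor. The sketch below is illustrative only (not the paper's code), and the partition sizes in main are assumed values chosen so that the result reproduces the example assignment above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NListPartitioner {
    /** Greedily assign first-item partitions to processors, largest partition first. */
    static Map<Integer, List<Character>> assign(Map<Character, Integer> partitionSizes,
                                                int numProcessors) {
        // Sort partitions by estimated number of 2-itemset N-Lists, descending.
        List<Map.Entry<Character, Integer>> parts = new ArrayList<>(partitionSizes.entrySet());
        parts.sort((a, b) -> b.getValue() - a.getValue());

        int[] load = new int[numProcessors];
        Map<Integer, List<Character>> assignment = new HashMap<>();
        for (Map.Entry<Character, Integer> p : parts) {
            int lightest = 0;                       // find the least-loaded processor
            for (int i = 1; i < numProcessors; i++)
                if (load[i] < load[lightest]) lightest = i;
            load[lightest] += p.getValue();
            assignment.computeIfAbsent(lightest, k -> new ArrayList<>()).add(p.getKey());
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Assumed partition sizes for the items of the running example.
        Map<Character, Integer> sizes = new LinkedHashMap<>();
        sizes.put('d', 6); sizes.put('f', 5); sizes.put('a', 2);
        sizes.put('c', 1); sizes.put('b', 1); sizes.put('e', 1);
        // With these sizes the greedy split reproduces the example:
        // {0=[d], 1=[f], 2=[a, c, b, e]}
        System.out.println(assign(sizes, 3));
    }
}
```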